1. Introduction

Over the past years, boosted by applications and computer performance, problems in high dimensions have been explored in a number of statistical studies. If no additional structure is assumed, high-dimensional data processing suffers from intrinsic difficulties such as the curse of dimensionality, which results in a loss of efficiency of statistical procedures, and the inconsistency of classical statistical procedures, even in the linear regression model, unless the dimension of the variables is less than the sample size.

In order to overcome the curse of dimensionality in a nonparametric framework, where typical functional classes are Sobolev, Hölder, or Besov balls, some additional conditions, including additivity or tensor product structure, are assumed, see, for instance, [20, 6, 18, 14, 15, 16] and references therein. Even if one of these conditions is assumed, the sample size is still required to be larger than the data dimension. One way to free oneself from the latter condition is to impose an additional sparsity constraint.

In this paper we focus on the problem of detection of high-dimensional signal functions in 
the Gaussian white noise model. To avoid difficulties stemming from high-dimensional settings, 
we suppose that a signal function satisfies an additional structural condition. Specifically, it is 
assumed to be sparse additive. This means that a high-dimensional function of interest is a sum 
of few univariate functions. Formally, we consider a $d$-dimensional ($d \in \mathbb{N}$, $d > 0$) Gaussian
white noise model

$$dX(t) = f(t)\,dt + \varepsilon\,dW(t), \quad t \in [0,1]^d, \tag{1.1}$$

where $W(t)$ is the Wiener process, $\varepsilon > 0$ is the noise level, and $f$, the quantity of interest, is the signal function. The additive sparse structure means that $f$ is the sum of $d$ univariate functions $f_j$:

$$f(t) = \sum_{j=1}^{d} \xi_j f_j(t_j), \quad t_j \in [0,1], \tag{1.2}$$

where the $\xi_j$'s are unknown but deterministic, taking their values in $\{0, 1\}$: "0" means that the $j$th component $f_j$ is inactive whereas "1" means that $f_j$ is active. Denote by $K$ the positive number of active components, that is, $K = \sum_{j=1}^{d} \xi_j$, and assume that $K = d^{1-b}$, where $b \in (0, 1)$ is the sparsity index. If $d^{1-b}$ is not an integer, then take $K$ as its integer part. Denote by $\mathcal{F}_{d,b}$ the functional class of additive sparse signals $f$ of the form (1.2) with $K = d^{1-b}$ active components and $d - K$ non-active components. Model (1.1) with the sparse additive structure (1.2) is a natural generalization of the sparse linear model: the nonparametric nature of the problem suggests considering more flexible models.

There is a huge statistical literature on estimation in sparse models, see, for instance, [1, 2, 3] and references therein. In particular, there are many works related to the well-known Lasso introduced by Tibshirani [21] in 1996. There are also a number of papers that deal with nonparametric estimation in sparse additive models. For a complete review of these topics, we refer to [19], where minimax estimation rates in sparse additive models are obtained, to [5], where the Lasso-type estimate in sparse additive models is studied, and to [20], where various structural assumptions on models in high dimensions are discussed.

Back to our study, the detection problem at hand can be expressed in terms of a nonparametric 
hypothesis testing problem with the null hypothesis stating that "the signal is a constant", and 
"there is no signal" being a particular case of the null hypothesis. In order to specify an alternative 
hypothesis, recall that, within the minimax framework, it is impossible to detect signal functions 
that are "too close" to the null one, as well as to test the null and alternative hypotheses for too 
large alternative classes. Therefore, we are interested in the following nonparametric hypothesis 
testing problem: 

$$H_0 : f = \mathrm{const}_0 \quad \text{versus} \quad H_1 : f = \mathrm{const}_1 + f^1, \ f^1 \in \mathcal{F}_d(\tau, r_\varepsilon, b), \tag{1.3}$$

where 

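$\mathrm{const}_0$, $\mathrm{const}_1$ are some constants,
$\mathcal{F}_d(\tau, r_\varepsilon, b) = \{ f \in \mathcal{F}_{d,b} : \forall j, \ f_j \in S_\tau \text{ and } \|f_j\|_2 \ge r_\varepsilon \}$, $\tau > 0$, $r_\varepsilon > 0$,
$S_\tau = \{ f \in L_2([0,1]) : \int_0^1 f(t)\,dt = 0, \ \|f\|_2^{(\tau)} \le 1 \}$.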

The $L_2$-norm $\|\cdot\|_2$ is used to separate the nonparametric alternative from the null hypothesis. The functional class $S_\tau$ is the Sobolev ball, expressed via the Sobolev semi-norm $\|\cdot\|_2^{(\tau)}$, that contains $\tau$-smooth functions, which are assumed 1-periodic and orthogonal to a constant. Due to the periodic constraints, it is possible to express $\|\cdot\|_2^{(\tau)}$ in terms of Fourier coefficients; this will be done in Section 3. The quantity $\tau$ is the smoothness parameter. Both the smoothness condition and the separation condition between $H_0$ and $H_1$ are expressed in terms of the components $f_j$ that are linked to the whole signal $f$ via (1.2): each active component $f_j$ is smooth and is separated from the null hypothesis in the $L_2$-norm by a positive value $r_\varepsilon$.

In Section 6, we generalize the hypothesis testing problem (1.3) by considering a more general class of alternatives that consists of signals $f$ equal, up to a constant, to a function $f^1 \in \mathcal{F}_{d,b}$, which is separated from the null hypothesis in the $L_2([0,1]^d)$-norm, and whose smoothness is expressed in terms of the whole function $f$.

For these two hypothesis testing problems, the main questions are: what are the separation rates in the problem, i.e., what are the asymptotics for the minimal $r_\varepsilon$ such that one can distinguish between $H_0$ and $H_1$? And, also, what are the optimal test procedures that provide distinguishability?

To answer these questions, we use the asymptotically minimax approach that provides detection boundaries or distinguishability conditions, i.e., necessary and sufficient conditions for the possibility of successful detection; these detection boundaries yield asymptotics for the minimal $r_\varepsilon$ separating the areas of distinguishability and non-distinguishability (between $H_0$ and $H_1$). The asymptotics for the minimal values of $r_\varepsilon$ are called either the (minimax) separation rates or the minimax rates of testing; in the present paper, the separation rates are denoted by $r_\varepsilon^*$.

In connection with the current study, a number of works on detection and classification boundaries in Gaussian sequence models could be mentioned, see, for example, [7, 8, 9, 13, 12, 4, 15, 16, 11]. Also, in [17], rather than considering a Gaussian sequence model, the authors generalize the problem of finding a detection boundary in the linear regression model. Another paper [10] deals with the signal detection problem in a multichannel model in the functional framework. Later in this section, we explain the differences between the results in [10] and our study.

The main contribution of this paper consists of extending the results on detection boundaries obtained for $d$-dimensional sparse Gaussian vectors, see, for instance, [12], to the functional case. In particular, we obtain the same detection boundaries as in the vectorial case. However, in the case of high sparsity, when $b > 1/2$, an additional assumption on the growth of $d$ as a function of $\varepsilon$ is required. Distinguishability is possible when the sum of the type I error probability and the maximum over alternatives of the type II error probability vanishes asymptotically, and distinguishability is not possible when this sum tends to one. Boundary conditions depend on the quantity $a(r_\varepsilon) = a(r_\varepsilon, d, \tau)$, which is a solution of a certain extremal problem stated in Section 4. In the vectorial case, the quantity $a(r_\varepsilon)$ corresponds to the energy of a signal (see [12] and [10]). In the functional case, this quantity characterizes the distinguishability in a one-variable hypothesis testing problem. The minimax separation rates obtained in this paper depend on the value of $b$: for large $b$ they are worse than for small $b$. Such a behaviour is expected because, with large $b$, only few components are active, and hence the problem of distinguishing between the alternative and null hypothesis becomes more difficult.

For the most difficult case of $b \in (1/2, 1)$, not only separation rates, but also sharp separation rates, that include both rates and constants, are obtained. We also provide optimal test procedures for which minimax rates of testing are achieved asymptotically. Depending on the value of $b$, we propose two types of test procedures: one is of a $\chi^2$ type, the other one is related to a Higher-Criticism statistic introduced in [4] and based on Tukey's ideas. In the case of $b \in (1/2, 1)$, our test procedure is adaptive in the sparsity index $b$, see Remark 5.3.

In the paper [10], which is focused on a similar problem of multichannel signal detection, the optimal rates are obtained. In our study, we obtain sharp separation rates for $b \in (1/2, 1)$. The main difference between the study of [10] and our work is in the quantity $a(r_\varepsilon)$ that characterizes the distinguishability: in our work, it is just a solution of a certain extremal problem, whereas in [10], it is obtained directly from the use of the respective test procedures.

The rest of the paper is organized as follows. Section 2 is concerned with the problem of finding the detection boundary in a sparse Gaussian $d$-vector model. In Section 3, we give a new formulation of the problem (1.3) in terms of sequence spaces. Section 4 is devoted to the description of the extremal problem that gives the distinguishability characteristics. The main results are stated in Section 5. In Section 6, we generalize the hypothesis testing problem (1.3) by considering more general alternatives. The proofs are given in Section 7.



2. Detection boundaries in a vectorial Gaussian model 



Hypothesis testing problems for $d$-dimensional vectors, under sparsity conditions similar to the ones we use, were studied in [7, 12, 4]. Namely, let $X = (X_1, \ldots, X_d)$ be a random vector of the form $X_j = v_j + \eta_j$, where $\eta_j \sim \mathcal{N}(0, 1)$, $j = 1, \ldots, d$, and

$$v_j = \xi_j a, \quad a > 0, \quad \xi_j \in \{0, 1\}, \quad K = \sum_{j=1}^{d} \xi_j = d^{1-b}, \quad b \in (0, 1). \tag{2.1}$$

Let $V_d(a, b) \subset \mathbb{R}^d$ be the set of all vectors $v = (v_1, \ldots, v_d)$ of the form (2.1). Then, the testing problem is stated as follows: it is required to test $H_0 : v = 0$ against the alternative $H_1 : v \in V_d(a, b)$. Here the questions of interest are: what are the asymptotics for $a = a_d$ as $d \to +\infty$ for which the hypotheses $H_0$ and $H_1$ separate asymptotically? Also, what are the optimal test procedures that provide the distinguishability (or separation) of $H_0$ and $H_1$?

The answer to each question depends essentially on the sparsity index $b \in (0, 1)$, see [7, 12, 4]. The detection boundaries are expressed in terms of $a$, $d$ and $b$: if $b \le 1/2$ (moderate sparsity), then the distinguishability is impossible when $a d^{1/2 - b} = o(1)$, and it is possible when $a d^{1/2 - b} \to +\infty$. This is achieved by the test procedure based on a simple linear statistic $t = d^{-1/2} \sum_{j=1}^{d} X_j$. If $b > 1/2$ (high sparsity), then the distinguishability conditions look as follows: the distinguishability is impossible when $\limsup a / T_d < \varphi(b)$, and it is possible when $\liminf a / T_d > \varphi(b)$, where $T_d = \sqrt{\log d}$ and the function $\varphi(b)$, $b \in (1/2, 1)$, is defined by

$$
\varphi(b) =
\begin{cases}
\varphi_1(b) = \sqrt{2b - 1}, & 1/2 < b \le 3/4, \\
\varphi_2(b) = \sqrt{2}\,\bigl(1 - \sqrt{1 - b}\bigr), & 3/4 < b < 1.
\end{cases}
\tag{2.2}
$$

Observe that the function $\varphi$ is positive, continuous, and increasing in $b \in (1/2, 1]$.

The test procedure that provides distinguishability in the high-sparsity case is based on the Higher-Criticism statistics introduced in [4]. It is defined as $L_d = \max_{s > s_0} L_d(s)$, for any $s_0 > 0$, with

$$L_d(s) = \frac{1}{\sqrt{d\,\Phi(-s)\,\Phi(s)}} \sum_{j=1}^{d} \bigl( \mathbb{1}_{\{X_j > s\}} - \Phi(-s) \bigr), \tag{2.3}$$

where, here and later, $\Phi$ stands for the standard Gaussian cumulative distribution function. Note that it suffices to take the maximum of $L_d$ over a discrete grid of the form $s_l = u_l T_d$, $u_l = \delta_d l$, $l = 1, \ldots, L$, such that $u_L \le \sqrt{2}$ and $\delta_d = o(1)$ is small enough.
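
As an illustration of the objects of this section, the following Python sketch evaluates the boundary function of (2.2) and a Higher-Criticism statistic on a grid of thresholds. It is only a sketch under stated assumptions: the statistic is taken to be the standardized count of coordinates exceeding $s$, normalized by $(d\,\Phi(-s)\Phi(s))^{-1/2}$, and the grid size `n_grid` as well as all function names are hypothetical choices made for the example.

```python
import math


def phi_boundary(b):
    """Detection-boundary function of (2.2), defined for 1/2 < b < 1."""
    if not 0.5 < b < 1.0:
        raise ValueError("b must lie in (1/2, 1)")
    if b <= 0.75:
        return math.sqrt(2.0 * b - 1.0)
    return math.sqrt(2.0) * (1.0 - math.sqrt(1.0 - b))


def std_normal_cdf(x):
    """Standard Gaussian cumulative distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def higher_criticism(xs, s):
    """Standardized number of coordinates of xs exceeding the threshold s."""
    d = len(xs)
    tail = std_normal_cdf(-s)  # P(N(0, 1) > s)
    exceed = sum(1 for x in xs if x > s)
    return (exceed - d * tail) / math.sqrt(d * tail * (1.0 - tail))


def max_higher_criticism(xs, n_grid=20):
    """Maximum over the grid s_l = u_l * T_d, with u_l = sqrt(2) * l / n_grid."""
    t_d = math.sqrt(math.log(len(xs)))
    grid = [math.sqrt(2.0) * t_d * l / n_grid for l in range(1, n_grid + 1)]
    return max(higher_criticism(xs, s) for s in grid)
```

A large value of `max_higher_criticism` indicates that more coordinates exceed some threshold than would be expected under pure noise; the rejection threshold itself is not fixed by this sketch.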



3. Transformation of the statistical testing problem 



Consider the tensor structure of the space $L_2([0,1]^d)$, i.e., $L_2([0,1]^d) = L_2([0,1]) \otimes \cdots \otimes L_2([0,1])$. Then, the corresponding orthonormal basis $(\phi_l)_{l \in \mathbb{Z}^d}$ of $L_2([0,1]^d)$ has the form

$$\phi_l(t) = \prod_{j=1}^{d} \phi_{l_j}(t_j), \quad t = (t_1, \ldots, t_d) \in [0,1]^d, \quad l = (l_1, \ldots, l_d) \in \mathbb{Z}^d,$$

where $(\phi_k)_{k \in \mathbb{Z}}$ is an orthonormal basis of $L_2([0,1])$. It is assumed that $\phi_0 = 1$. For any $(j, k) \in \{1, \ldots, d\} \times \mathbb{Z}$, let us define $\phi_{j,k}$ as

$$\phi_{j,k}(t) = \phi_l(t) = \phi_k(t_j), \quad l = (0, \ldots, 0, k, 0, \ldots, 0),$$

where $k$ is the $j$-th component of $l$. Observe that $\phi_{j,0} = 1$. Using the orthonormal system $(\phi_{j,k})_{(j,k) \in \{1, \ldots, d\} \times \mathbb{Z}}$, consider the statistics $(x_j)_{1 \le j \le d} = \{x_{j,k}, \ k \in \mathbb{Z}\}_{1 \le j \le d}$ defined by

$$
x_{j,k} = \int_{[0,1]^d} \phi_{j,k}(t)\,dX(t)
= \xi_j \int_{[0,1]} \phi_k(t_j) f_j(t_j)\,dt_j + \varepsilon \eta_{j,k}
= \xi_j \theta_{j,k} + \varepsilon \eta_{j,k}, \tag{3.1}
$$

where the random variables $\eta_{j,k} = \int_{[0,1]^d} \phi_{j,k}(t)\,dW(t)$ are i.i.d. real standard Gaussian random variables and $\theta_{j,k} = \int_{[0,1]} \phi_k(t_j) f_j(t_j)\,dt_j$. Set $\theta_j = (\theta_{j,k})_{k \in \mathbb{Z}}$ and $\theta = (\theta_j)_{1 \le j \le d}$.

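Thanks to the periodic constraints, we may consider $(\phi_k)_{k \in \mathbb{Z}}$ as the standard Fourier basis. Then the Sobolev semi-norm of $f_j$ can be expressed in terms of its Fourier coefficients as follows: $\|f_j\|_2^{(\tau)} = \bigl( (2\pi)^{2\tau} \sum_{k \in \mathbb{Z}} |k|^{2\tau} \theta_{j,k}^2 \bigr)^{1/2}$. Therefore, the functional class $\mathcal{F}_d(\tau, r_\varepsilon, b)$ can be equivalently represented as the sequence space $\Theta_d(\tau, r_\varepsilon, b)$:

$$\Theta_d(\tau, r_\varepsilon, b) = \Bigl\{ \theta = (\xi_1 \theta_1, \ldots, \xi_d \theta_d) : \ \xi_j \in \{0, 1\}, \ \sum_{j=1}^{d} \xi_j = d^{1-b}; \ \theta_j \in \Theta(\tau, r_\varepsilon) \Bigr\},$$

where

$$\Theta(\tau, r_\varepsilon) = \Bigl\{ \theta \in l_2(\mathbb{Z}) : \ (2\pi)^{2\tau} \sum_{k \in \mathbb{Z}} |k|^{2\tau} \theta_k^2 \le 1; \ \sum_{k \in \mathbb{Z}} \theta_k^2 \ge r_\varepsilon^2 \Bigr\}.$$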

The testing problem of interest (1.3) can be rewritten in the form

$$H_0 : \theta = 0 \quad \text{versus} \quad H_1 : \theta \in \Theta_d(\tau, r_\varepsilon, b).$$

Denote by $P_0$ and $P_\theta$ the distributions under the null and alternative hypotheses, respectively. Also, denote by $E_0$, $\mathrm{Var}_0$, $E_\theta$, and $\mathrm{Var}_\theta$ the expectations and variances with respect to $P_0$ and $P_\theta$, respectively. The notation $P_{\theta_j}$, $E_{\theta_j}$ and $\mathrm{Var}_{\theta_j}$ will also be used: they are related to the distribution of the observations $x_j = (x_{j,k})_{k \in \mathbb{Z}}$.
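
The sequence-space constraints above are easy to check numerically for finitely supported sequences. The following Python sketch does so under some assumptions made only for illustration: each sequence is stored as a dictionary mapping a frequency $k$ to $\theta_k$, the number of active components is taken as the integer part of $d^{1-b}$, and the function names are hypothetical.

```python
import math


def sobolev_seminorm_sq(theta, tau):
    """(2*pi)^(2*tau) * sum_k |k|^(2*tau) * theta_k^2, for theta given as {k: theta_k}."""
    return (2.0 * math.pi) ** (2.0 * tau) * sum(
        abs(k) ** (2.0 * tau) * t * t for k, t in theta.items()
    )


def l2_norm_sq(theta):
    """Squared l2-norm sum_k theta_k^2."""
    return sum(t * t for t in theta.values())


def in_theta(theta, tau, r_eps):
    """Membership in Theta(tau, r_eps): Sobolev constraint and separation from zero."""
    return sobolev_seminorm_sq(theta, tau) <= 1.0 and l2_norm_sq(theta) >= r_eps ** 2


def in_theta_d(components, xi, tau, r_eps, b):
    """Membership in Theta_d(tau, r_eps, b) for d components with activity flags xi."""
    d = len(components)
    if sum(xi) != int(d ** (1.0 - b)):  # integer part of d^(1-b) active components
        return False
    return all(in_theta(c, tau, r_eps) for c, flag in zip(components, xi) if flag == 1)
```

For instance, with $\tau = 1$ the sequence with $\theta_{1} = \theta_{-1} = 0.05$ has Sobolev semi-norm well below one and squared $l_2$-norm $0.005$, so it belongs to $\Theta(1, r_\varepsilon)$ whenever $r_\varepsilon^2 \le 0.005$.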

For any test procedure $\psi$, that is, for any function measurable with respect to the observations and taking its values in the interval $[0, 1]$, let $\omega(\psi) = E_0(\psi)$ be the type I error probability and let $\beta(\psi, \Theta_d(\tau, r_\varepsilon, b)) = \sup_{\theta \in \Theta_d(\tau, r_\varepsilon, b)} E_\theta(1 - \psi)$ be the maximal type II error probability over the set $\Theta_d(\tau, r_\varepsilon, b)$. Also, consider the total error probability $\gamma(\psi, \Theta_d(\tau, r_\varepsilon, b)) = \omega(\psi) + \beta(\psi, \Theta_d(\tau, r_\varepsilon, b))$, and denote by $\gamma$ or $\gamma(\Theta_d(\tau, r_\varepsilon, b))$ the minimax total error probability over $\Theta_d(\tau, r_\varepsilon, b)$, that is,

$$\gamma = \gamma(\Theta_d(\tau, r_\varepsilon, b)) = \inf_{\psi} \gamma(\psi, \Theta_d(\tau, r_\varepsilon, b)), \tag{3.2}$$

where the infimum is taken over all test procedures. One cannot distinguish between $H_0$ and $H_1$ if $\gamma \to 1$, and distinguishability occurs if there exists $\psi$ such that either $\gamma(\psi, \Theta_d(\tau, r_\varepsilon, b)) \to 0$ or $\beta(\psi, \Theta_d(\tau, r_\varepsilon, b)) = o(1)$ once $\psi$ has a given asymptotic level.

The aim of this paper is to provide separation rates for the alternatives $\Theta_d(\tau, r_\varepsilon, b)$ and to determine statistical procedures $\psi$ and/or $\psi_\alpha$ asymptotically of level $\alpha$, i.e., $E_0(\psi_\alpha) \le \alpha + o(1)$, for which these separation rates are achieved.

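By the separation rates we mean a family $r_\varepsilon^*$ such that

$$
\begin{cases}
\gamma \to 1 & \text{if } r_\varepsilon / r_\varepsilon^* \to 0, \\
\gamma(\psi, \Theta_d(\tau, r_\varepsilon, b)) \to 0 \ \text{and/or} \ \forall \alpha \in (0, 1) \ \beta(\psi_\alpha, \Theta_d(\tau, r_\varepsilon, b)) \to 0 & \text{if } r_\varepsilon / r_\varepsilon^* \to +\infty.
\end{cases}
$$

By the sharp separation rates, we mean a family $r_\varepsilon^*$ such that

$$
\begin{cases}
\gamma \to 1 & \text{if } \limsup r_\varepsilon / r_\varepsilon^* < 1, \\
\gamma(\psi, \Theta_d(\tau, r_\varepsilon, b)) \to 0 \ \text{and/or} \ \forall \alpha \in (0, 1) \ \beta(\psi_\alpha, \Theta_d(\tau, r_\varepsilon, b)) \to 0 & \text{if } \liminf r_\varepsilon / r_\varepsilon^* > 1.
\end{cases}
$$

Typically, asymptotics for models like model (1.1) are given as $\varepsilon \to 0$. However, we are mainly interested in high-dimensional settings when $d \to +\infty$. Therefore, here and later, asymptotics and symbols $o$, $O$, $\sim$ and $\asymp$ are used when $\varepsilon \to 0$ and $d \to +\infty$, except for the cases when it is explicitly specified, say, $O_d$ is used when $d \to +\infty$. The notation $A \triangleq B$ means that we use notation $A$ for quantity $B$.
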

4. Extremal problem 

In this section, we explain the quantity $a(r_\varepsilon)$ that corresponds to the energy of a signal in the vectorial case. Only in this section, we assume that the observations have the form $x_k = \theta_k + \varepsilon \eta_k$ for $k \in \mathbb{Z}$, where the $\eta_k$'s are i.i.d. real standard Gaussian random variables. The quantity $a(r_\varepsilon)$ denotes the solution of the extremal problem

$$a^2(r_\varepsilon) = \frac{1}{2\varepsilon^4} \inf_{\theta \in \Theta(\tau, r_\varepsilon)} \sum_{k \in \mathbb{Z}} \theta_k^4 \tag{4.1}$$

and characterizes distinguishability in the minimax detection problem for one-variable functions lying in $S_\tau$ and separated from the null hypothesis in $L_2$ by positive values $r_\varepsilon$, i.e., for $t \in [0, 1]$, $f(t) = \sum_{k \in \mathbb{Z}} \theta_k \phi_k(t)$ with $f \in S_\tau$ and $\|f\|_2 \ge r_\varepsilon$.

Namely, if $a(r_\varepsilon) \to 0$, then the minimax total error probability $\gamma(\Theta(\tau, r_\varepsilon)) \to 1$, and if $a(r_\varepsilon) \to +\infty$, then $\gamma(\Theta(\tau, r_\varepsilon)) \to 0$.

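Furthermore, let $\theta^* = \theta^*(r_\varepsilon)$ be a sequence in $l_2(\mathbb{Z})$ that provides a solution to the extremal problem (4.1). Set

$$w_k(r_\varepsilon) = \frac{1}{2}\,\frac{(\theta_k^*(r_\varepsilon))^2}{a(r_\varepsilon)\,\varepsilon^2}, \quad k \in \mathbb{Z}. \tag{4.2}$$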



Suppose that

$$a(r_\varepsilon) \asymp 1, \quad \sup_{k \in \mathbb{Z}} w_k(r_\varepsilon) = o(1). \tag{4.3}$$

Then, we get the sharp asymptotics

$$\gamma(\Theta(\tau, r_\varepsilon)) = 2\Phi(-a(r_\varepsilon)/2) + o(1).$$
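
To get a feeling for this relation, the leading term $2\Phi(-a/2)$ can be tabulated directly. The short Python sketch below does this; the function name is hypothetical, and only the leading term is computed (the $o(1)$ remainder is ignored).

```python
import math


def minimax_total_error(a):
    """Leading term 2 * Phi(-a / 2) of the minimax total error probability."""
    # 2 * Phi(-a / 2) = erfc(a / (2 * sqrt(2))) for the standard Gaussian cdf Phi.
    return math.erfc(a / (2.0 * math.sqrt(2.0)))


if __name__ == "__main__":
    for a in (0.0, 1.0, 2.0, 4.0, 8.0):
        print(a, minimax_total_error(a))
```

The value equals one at $a = 0$ and decreases to zero as $a$ grows, in agreement with the two limiting regimes $a(r_\varepsilon) \to 0$ and $a(r_\varepsilon) \to +\infty$.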

For the reader's convenience, we give a sketch of the proofs of these results. The proofs are based on the methods and results of Sections 3.1, 3.3, 4.3 in [13]. In the vectorial case at hand, we also describe the structure of asymptotically minimax tests.

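In order to obtain lower bounds, we consider the Bayesian hypothesis testing problem with the product prior distribution on $\theta$, using the symmetric two-point factors: $\pi = \prod_{k \in \mathbb{Z}} \pi_k$, $\pi_k = \frac{1}{2}(\delta_{-\theta_k} + \delta_{\theta_k})$ for $\theta \in \Theta(\tau, r_\varepsilon)$, and $\delta$ is the Dirac mass. Let $P_\pi$ be the mixture of measures $P_\theta$ over $\pi$. Observe that

$$\frac{dP_\pi}{dP_0}\bigl((x_k)_{k \in \mathbb{Z}}\bigr) = \prod_{k \in \mathbb{Z}} \frac{dP_{\pi_k}}{dP_0}(x_k) = \prod_{k \in \mathbb{Z}} \exp\bigl(-\theta_k^2 / (2\varepsilon^2)\bigr) \cosh\bigl(x_k \theta_k / \varepsilon^2\bigr).$$

For the sake of simplicity, denote $\frac{dP_\pi}{dP_0} = \frac{dP_\pi}{dP_0}\bigl((x_k)_{k \in \mathbb{Z}}\bigr)$. Since $\pi(\Theta(\tau, r_\varepsilon)) = 1$, we have, see Proposition 2.12 in [13],

$$\gamma(\Theta(\tau, r_\varepsilon)) \ge 1 - \frac{1}{2} E_0 \bigl| dP_\pi / dP_0 - 1 \bigr| \ge 1 - \frac{1}{2} \bigl( E_0 (dP_\pi / dP_0 - 1)^2 \bigr)^{1/2} = 1 - \frac{1}{2} \bigl( E_0 (dP_\pi / dP_0)^2 - 1 \bigr)^{1/2}.$$

This yields $\gamma(\Theta(\tau, r_\varepsilon)) \to 1$ as soon as $E_0 (dP_\pi / dP_0)^2 \to 1$. Simple calculations and the inequality $\cosh(x) \le \exp(x^2 / 2)$ give

$$E_0 (dP_\pi / dP_0)^2 = \prod_{k \in \mathbb{Z}} E_0 (dP_{\pi_k} / dP_0)^2 = \prod_{k \in \mathbb{Z}} \cosh\bigl((\theta_k / \varepsilon)^2\bigr) \le \exp\Bigl( \frac{1}{2\varepsilon^4} \sum_{k \in \mathbb{Z}} \theta_k^4 \Bigr).$$
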



Therefore, providing the "asymptotically least favorable prior" of the type under consideration 
leads to the problem (4.1). 

Under assumption (4.3), taking the prior based on the extremal sequence in the problem (4.1), one can show that the Bayesian log-likelihood ratio is asymptotically Gaussian:

$$\log(dP_\pi / dP_0) = \sum_{k \in \mathbb{Z}} \Bigl( -\frac{(\theta_k^*)^2}{2\varepsilon^2} + \log \cosh\bigl(x_k \theta_k^* / \varepsilon^2\bigr) \Bigr) = -a^2(r_\varepsilon)/2 + a(r_\varepsilon)\,\eta_\varepsilon + \rho_\varepsilon,$$

where $\eta_\varepsilon \to \eta \sim \mathcal{N}(0, 1)$ and $\rho_\varepsilon \to 0$ in $P_0$-probability. The proof is based on Taylor's expansion, see Section 4.3.1 of [13]. This yields the sharp lower bounds.

In order to obtain upper bounds, take a sequence $q = (q_k)_{k \in \mathbb{Z}}$ such that $q_k \ge 0$, $\sum_k q_k^2 = 1/2$, and consider $t_q$, a centered and normalized (under $P_0$) statistic of a weighted $\chi^2$-type:

$$t_q = \sum_{k \in \mathbb{Z}} q_k \bigl( (x_k / \varepsilon)^2 - 1 \bigr).$$

Consider also the test procedures $\psi_{H,q} = \mathbb{1}_{\{t_q > H\}}$. Observe that $E_0 t_q = 0$, $\mathrm{Var}_0 t_q = 1$, and $t_q$ are asymptotically standard Gaussian under $P_0$. These observations imply $\omega(\psi_{H,q}) = \Phi(-H) + o(1)$.
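Denote by $\kappa(\theta, q)$ and $\kappa(q)$ the following functions:

$$\kappa(\theta, q) = \sum_{k \in \mathbb{Z}} q_k \theta_k^2, \quad \kappa(q) = \kappa(\Theta(\tau, r_\varepsilon), q) = \inf_{\theta \in \Theta(\tau, r_\varepsilon)} \kappa(\theta, q). \tag{4.4}$$

Then,

$$E_\theta t_q = \varepsilon^{-2} \kappa(\theta, q), \quad \mathrm{Var}_\theta t_q = 1 + 4\varepsilon^{-2} \sum_{k} q_k^2 \theta_k^2 = 1 + O\bigl( (\max_k q_k)\, E_\theta t_q \bigr),$$

and hence, by Chebyshev's inequality, $\beta(\psi_{H,q}, \Theta(\tau, r_\varepsilon)) \to 0$ when $\varepsilon^{-2} \kappa(q) \to +\infty$ and $H \le c\,\varepsilon^{-2} \kappa(q)$, $c \in (0, 1)$. Under assumption (4.3), one can check that the statistic $\tilde{t}_q = t_q - E_\theta t_q$ is asymptotically standard Gaussian under $P_\theta$ such that $E_\theta t_q = O(1)$. Therefore

$$\beta(\psi_{H,q}, \Theta(\tau, r_\varepsilon)) \le \Phi\bigl(H - \varepsilon^{-2} \kappa(q)\bigr) + o(1).$$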
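
The moment formulas and the Gaussian error approximations above can be checked on small examples. The following Python sketch is written under the assumption that the weight sequence is truncated to finitely many coordinates; the function names are hypothetical, and the error values returned are only the Gaussian leading terms, not exact error probabilities.

```python
import math


def std_normal_cdf(x):
    """Standard Gaussian cumulative distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def kappa(theta, q):
    """kappa(theta, q) = sum_k q_k * theta_k^2."""
    return sum(qk * tk * tk for tk, qk in zip(theta, q))


def mean_and_variance(theta, q, eps):
    """Mean and variance of t_q when x_k = theta_k + eps * eta_k."""
    mean = kappa(theta, q) / eps ** 2
    # Var((x_k / eps)^2) = 2 + 4 * theta_k^2 / eps^2 for a unit-variance Gaussian shift.
    var = sum(qk * qk * (2.0 + 4.0 * tk * tk / eps ** 2) for tk, qk in zip(theta, q))
    return mean, var


def gaussian_error_terms(theta, q, eps, threshold):
    """Leading terms Phi(-H) and Phi(H - eps^-2 * kappa(theta, q)) of the two errors."""
    mean, _ = mean_and_variance(theta, q, eps)
    return std_normal_cdf(-threshold), std_normal_cdf(threshold - mean)
```

With $q = (1/2, 1/2)$, so that $\sum_k q_k^2 = 1/2$, the null variance returned is exactly one, and choosing the threshold as half of the mean under the alternative balances the two Gaussian error terms.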

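In order to determine "asymptotically the best sequence" $(q_k)_{k \in \mathbb{Z}}$, it suffices to find a solution of the following maximin problem:

$$\tilde{a}(r_\varepsilon) = \varepsilon^{-2} \sup_{q} \kappa(q). \tag{4.5}$$

First, we change the variables for $v = (v_k)_{k \in \mathbb{Z}}$ and $p = (p_k)_{k \in \mathbb{Z}}$, where $v_k = \theta_k^2 / \sqrt{2}$, $p_k = \sqrt{2}\, q_k$. Then, by convexity of the set

$$V_+ = \Bigl\{ v \in l_1(\mathbb{Z}) : \ v_k \ge 0; \ (2\pi)^{2\tau} \sum_{k \in \mathbb{Z}} |k|^{2\tau} v_k \le 2^{-1/2}; \ \sum_{k \in \mathbb{Z}} v_k \ge 2^{-1/2} r_\varepsilon^2 \Bigr\}, \tag{4.6}$$

and using the minimax theorem, we get

$$
\tilde{a}(r_\varepsilon) = \varepsilon^{-2} \sup_{p} \inf_{v \in V_+} \sum_{k} p_k v_k
= \varepsilon^{-2} \inf_{v \in V_+} \sup_{p} \sum_{k} p_k v_k
= \varepsilon^{-2} \inf_{v \in V_+} \Bigl( \sum_{k} v_k^2 \Bigr)^{1/2}
= \frac{1}{\sqrt{2}\,\varepsilon^2} \inf_{\theta \in \Theta(\tau, r_\varepsilon)} \Bigl( \sum_{k} \theta_k^4 \Bigr)^{1/2} = a(r_\varepsilon),
$$

where the supremum is taken over sequences $p$ with $p_k \ge 0$ and $\sum_k p_k^2 = 1$.

Thus, asymptotically the best sequence $(q_k)_{k \in \mathbb{Z}}$ is the sequence $w(r_\varepsilon) = (w_k(r_\varepsilon))_{k \in \mathbb{Z}}$ of the form (4.2), and the value of the problem (4.5) coincides with the value of the problem (4.1). Setting $H = a(r_\varepsilon)/2$, we get the upper bounds and the structure of asymptotically minimax tests.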
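Note that the above evaluations entail (see also Proposition 4.1 in [13]) that

$$\inf_{\theta \in \Theta(\tau, r_\varepsilon)} \varepsilon^{-2} \kappa(\theta, w(r_\varepsilon)) \ge a(r_\varepsilon). \tag{4.7}$$

Moreover, if $\bigl( \sum_{k \in \mathbb{Z}} \theta_k^2 \bigr)^{1/2}$ is larger than $r_\varepsilon$, then $\kappa(\theta, w(r_\varepsilon))$ becomes rather large. Namely, let us denote

$$\kappa(r_\varepsilon, B) = \inf_{\theta \in \Theta(\tau, B r_\varepsilon)} \kappa(\theta, w(r_\varepsilon)), \quad B \ge 1.$$

Proposition 4.1. Let $B \ge 1$, then

$$\frac{1}{\varepsilon^2}\, \kappa(r_\varepsilon, B) \ge B^2 a(r_\varepsilon).$$

Proof of Proposition 4.1.

Set $\Theta(\tau, A, r_\varepsilon) = \{ \theta \in l_2(\mathbb{Z}) : (2\pi)^{2\tau} \sum_{k \in \mathbb{Z}} |k|^{2\tau} \theta_k^2 \le A^2, \ \sum_{k \in \mathbb{Z}} \theta_k^2 \ge r_\varepsilon^2 \}$, $A > 0$. Since $\Theta(\tau, B r_\varepsilon) \subset \Theta(\tau, B, B r_\varepsilon)$, we have

$$\inf_{\theta \in \Theta(\tau, B r_\varepsilon)} \kappa(\theta, w(r_\varepsilon)) \ge \inf_{\theta \in \Theta(\tau, B, B r_\varepsilon)} \kappa(\theta, w(r_\varepsilon)) = B^2 \inf_{\theta \in \Theta(\tau, r_\varepsilon)} \kappa(\theta, w(r_\varepsilon)) \ge B^2 \varepsilon^2 a(r_\varepsilon),$$

where the last inequality follows from (4.7). This completes the proof.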

The solution of the extremal problem (4.1) is obtained in Ingster and Suslina [13], Section 
4.3. Adapting the derivations on pages 146-147 of Section 4.3.2. in [13] to our case, we set C3 = 

— B(a,b), C2 = — B(b,c) and cq = — B(a,d), where B{-, •) is the Euler Beta function, a = — , 
4r 4t 8r 2t 

b = 1 + — , c = 2 and d = 3. 
2r 

Lemma 4.1. The solution of the extremal problem (4.1) is given by

$$a(r_\varepsilon) \sim (c_1(\tau))^{1/2}\, r_\varepsilon^{2 + 1/(2\tau)}\, \varepsilon^{-2} \quad \text{as } r_\varepsilon \to 0, \tag{4.8}$$

where

$$c_1(\tau) = c_0\, \pi\, c_2^{1/(2\tau)} \Bigl( \frac{1}{c_3} \Bigr)^{(4\tau + 1)/(2\tau)} \ \text{is a positive constant.} \tag{4.9}$$

Remark 4.1. One must note that $r_\varepsilon \to 0$ is the only condition we need to obtain the asymptotic solution of (4.1). In particular, it is not required that $\varepsilon \to 0$, and Lemma 4.1 is valid whatever the value of $\varepsilon > 0$ is.

Sketch of proof of Lemma 4.1.

Following Chapter 4 in [13], observe that by setting $v_k = \theta_k^2 / \sqrt{2}$ for all $k \in \mathbb{Z}$, one can transform the minimization problem under constraints (4.1) into the following one:

$$v_\varepsilon^2 = \inf_{v \in V_+} \sum_{k \in \mathbb{Z}} v_k^2,$$

where $V_+$ is defined by equation (4.6). The space $l_1^+(\mathbb{Z})$ contains non-negative sequences lying in $l_1(\mathbb{Z})$. Note that $v_\varepsilon^2 = \varepsilon^4 a^2(r_\varepsilon)$. The convexity of the set $V_+$ assures the uniqueness of the solution. In order to determine the solution, rewrite as in Section 4.3 in [13] the sequence $(v_k)_{k \in \mathbb{Z}}$ as follows: $v_k = v_0\, \zeta(k/m)$, where $\zeta(y) = (1 - |y|^{2\tau})\, \mathbb{1}_{(|y| < 1)}$ and $m > 0$. By using the Lagrange multipliers rule, it is possible to obtain the following relations, as $r_\varepsilon \to 0$ and $m \to +\infty$:

$$c_3 v_0 m \sim 2^{-1/2} r_\varepsilon^2, \quad v_\varepsilon^2 \sim c_0 v_0^2 m, \quad c_2 v_0 m^{2\tau + 1} \sim 2^{-1/2} (2\pi)^{-2\tau}, \tag{4.10}$$

which entail the existence of $v_\varepsilon^2$ satisfying $v_\varepsilon^2 \sim c_1(\tau)\, r_\varepsilon^{4 + 1/\tau}$, and thus $a^2(r_\varepsilon) \sim c_1(\tau)\, \varepsilon^{-4} r_\varepsilon^{4 + 1/\tau}$. If $r_\varepsilon \to 0$, then the first and second relations in (4.10) entail that

$$v_0 \asymp v_\varepsilon^2 r_\varepsilon^{-2} \asymp r_\varepsilon^{2 + 1/\tau}, \tag{4.11}$$

which implies that $m \to +\infty$ since the third relation in (4.10) yields $m \asymp v_0^{-1/(2\tau + 1)} \asymp r_\varepsilon^{-1/\tau}$.

Remark 4.2. The form of the function $\zeta$ and relation (4.11) imply that $\sup_k v_k \le v_0 = o(1)$.


5. Main results 

Depending on the values of $b$, we distinguish between two types of sparsity: the moderate sparsity case with $b \in (0, 1/2]$ and the high sparsity case with $b \in (1/2, 1)$. In each case, although being of different types, the "best" test procedures that achieve the separation rates are based on the $\chi^2$-type statistics $(t_j)_{1 \le j \le d}$ determined in the same way as the "best statistic" $t_q$ of a weighted $\chi^2$-type in Section 4.

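Let us introduce a general version of the $\chi^2$-type statistics of interest. For $j$ in $\{1, \ldots, d\}$, put

$$t_j = \sum_{k \in \mathbb{Z}} w_k \bigl( (x_{j,k} / \varepsilon)^2 - 1 \bigr), \tag{5.1}$$

where $(w_k)_{k \in \mathbb{Z}}$ is the sequence of weights such that $w_k \ge 0$ for all $k$ in $\mathbb{Z}$ and $\sum_{k \in \mathbb{Z}} w_k^2 = 1/2$. Set also

$$t_{j,k} = w_k \bigl( (x_{j,k} / \varepsilon)^2 - 1 \bigr), \tag{5.2}$$

so that $t_j = \sum_{k \in \mathbb{Z}} t_{j,k}$.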

Recall that $T_d = \sqrt{\log d}$ (see Section 2). Similarly to (2.3) and for any $u \in (0, \sqrt{2}]$, let us define the statistics $L(u)$ on which the Higher-Criticism type test procedure is built:

$$L(u) = C_u \sum_{j=1}^{d} \bigl( \mathbb{1}_{\{t_j > u T_d\}} - \Phi_0(u T_d) \bigr), \tag{5.3}$$

where

$$\Phi_0(x) = P_0(t_j > x), \tag{5.4}$$

$$C_u = \bigl( d\, \Phi_0(u T_d)\, (1 - \Phi_0(u T_d)) \bigr)^{-1/2}. \tag{5.5}$$
Taking into account the sparsity condition, we consider a particular sequence of weights $(w_k(r_\varepsilon^*))_{k \in \mathbb{Z}}$ defined by equation (4.2) with $r_\varepsilon^* = r_\varepsilon^*(b)$ being the separation rates depending on $b$ in $(0, 1)$. Then, for all $j \in \{1, \ldots, d\}$, we consider the statistics $t_{j,b}$ as in (5.1) with the weight sequence $(w_k(r_\varepsilon^*))_{k \in \mathbb{Z}}$, that is,

$$t_{j,b} = \sum_{k \in \mathbb{Z}} w_k(r_\varepsilon^*) \bigl( (x_{j,k} / \varepsilon)^2 - 1 \bigr).$$

Also, denote by $t_b$ the normalized empirical mean of the $t_{j,b}$'s:

$$t_b = \frac{1}{\sqrt{d}} \sum_{j=1}^{d} t_{j,b}. \tag{5.6}$$

Similarly, replacing $t_j$ by $t_{j,b}$, consider the statistics $L(u, b)$, $C_{u,b}$, and $\Phi_{0,b}$ defined by equations (5.3), (5.5) and (5.4) respectively, that is,

$$L(u, b) = C_{u,b} \sum_{j=1}^{d} \bigl( \mathbb{1}_{\{t_{j,b} > u T_d\}} - \Phi_{0,b}(u T_d) \bigr), \tag{5.7}$$

$$C_{u,b} = \bigl( d\, \Phi_{0,b}(u T_d)\, (1 - \Phi_{0,b}(u T_d)) \bigr)^{-1/2},$$

$$\Phi_{0,b}(x) = P_0(t_{j,b} > x).$$

5.1. Moderate sparsity 

In case of moderate sparsity, for any $\alpha \in (0, 1)$, consider the $\chi^2$-type test procedure:

$$\psi^{\chi^2}_{\alpha, b} = \mathbb{1}_{\{t_b > T_\alpha\}}, \tag{5.8}$$

where $t_b$ is defined in (5.6) and $T_\alpha$ is the $(1 - \alpha)$-quantile of a real standard Gaussian random variable.
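
A minimal Python sketch of this test procedure is given below. It assumes that the observations $x_{j,k}$ are stored as a list of $d$ finite lists, that the weight sequence has been truncated to the same finite length and already satisfies $\sum_k w_k^2 = 1/2$, and that the function names are hypothetical; the weights themselves would come from (4.2) and are simply passed in here.

```python
import math
from statistics import NormalDist


def chi2_component(x_j, weights, eps):
    """t_{j,b} = sum_k w_k * ((x_{j,k} / eps)^2 - 1) for one channel j."""
    return sum(w * ((x / eps) ** 2 - 1.0) for x, w in zip(x_j, weights))


def normalized_mean(x, weights, eps):
    """t_b = d^(-1/2) * sum_j t_{j,b}."""
    d = len(x)
    return sum(chi2_component(x_j, weights, eps) for x_j in x) / math.sqrt(d)


def chi2_type_test(x, weights, eps, alpha):
    """Return 1 (reject H_0) when t_b exceeds the (1 - alpha)-quantile T_alpha."""
    t_alpha = NormalDist().inv_cdf(1.0 - alpha)
    return 1 if normalized_mean(x, weights, eps) > t_alpha else 0
```

Since each $t_{j,b}$ is centered with unit variance under the null hypothesis, comparing $t_b$ with a Gaussian quantile is what gives the test its asymptotic level $\alpha$.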

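Theorem 5.1. Assume that $r_\varepsilon \to 0$ and let $a(r_\varepsilon)$ be given by (4.8). Then, the following results hold true.

• (i) Lower bound.

If $a(r_\varepsilon)\, d^{1/2 - b} = o(1)$, then $\gamma \to 1$.
If $a(r_\varepsilon)\, d^{1/2 - b} = O(1)$, then $\liminf \gamma > 0$.

• (ii) Upper bound. Let $r_\varepsilon^* = r_\varepsilon^*(b)$ be determined by the relation $a(r_\varepsilon^*) \asymp d^{b - 1/2}$ and $\psi^{\chi^2}_{\alpha, b}$ be defined by (5.8). Then,

Type I error: $\forall \alpha \in (0, 1)$, $\omega(\psi^{\chi^2}_{\alpha, b}) = \alpha + o(1)$.

Type II error: if $a(r_\varepsilon)\, d^{1/2 - b} \to +\infty$, then $\beta(\psi^{\chi^2}_{\alpha, b}, \Theta_d(\tau, r_\varepsilon, b)) = o(1)$.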

Remark 5.1. Note that we obtain the same detection boundaries as in the vectorial case (see Section 2): the areas of distinguishability and non-distinguishability depend on the limit of $d^{1/2 - b} a(r_\varepsilon)$. The condition $d^{1/2 - b} a(r_\varepsilon) \asymp a(r_\varepsilon) / a(r_\varepsilon^*) \to +\infty$ is equivalent to $r_\varepsilon / r_\varepsilon^* \to +\infty$ where by (4.8)

$$r_\varepsilon^* \asymp \bigl( \varepsilon^4 d^{2b - 1} \bigr)^{\tau / (4\tau + 1)}. \tag{5.9}$$

In order to use Lemma 4.1, the condition $r_\varepsilon \to 0$ is required. Note that the requirement $r_\varepsilon^* \to 0$ is always fulfilled for $b \in (0, 1/2)$ whatever the value of $\varepsilon > 0$ is as soon as $d \to +\infty$. For $b = 1/2$, the condition $r_\varepsilon^* \to 0$ holds when $\varepsilon \to 0$.
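
The passage from the relation $a(r_\varepsilon^*) \asymp d^{b - 1/2}$ to the rate (5.9) can be verified numerically. In the Python sketch below, all constants (in particular $c_1(\tau)$) are dropped, so the functions only return orders of magnitude; the function names are hypothetical.

```python
def moderate_rate(eps, d, b, tau):
    """Order of the separation rate (5.9): (eps^4 * d^(2b - 1))^(tau / (4 tau + 1))."""
    if not 0.0 < b <= 0.5:
        raise ValueError("moderate sparsity requires 0 < b <= 1/2")
    return (eps ** 4 * d ** (2.0 * b - 1.0)) ** (tau / (4.0 * tau + 1.0))


def energy_order(r, eps, tau):
    """Order of a(r) from (4.8) with the constant c_1(tau) omitted."""
    return r ** (2.0 + 1.0 / (2.0 * tau)) / eps ** 2


if __name__ == "__main__":
    eps, d, b, tau = 0.1, 10_000, 0.25, 2.0
    r_star = moderate_rate(eps, d, b, tau)
    # Both printed numbers should agree: a(r*) is of order d^(b - 1/2).
    print(energy_order(r_star, eps, tau), d ** (b - 0.5))
```

For $b < 1/2$ the rate decreases as $d$ grows with $\varepsilon$ fixed, while for $b = 1/2$ it no longer depends on $d$, which matches the two cases discussed in the remark.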

5.2. High sparsity

Let us define the Higher-Criticism type test procedure. Let $r_\varepsilon^* = r_\varepsilon^*(b)$ be determined by the relation $a(r_\varepsilon^*) \sim \varphi(b)\, T_d$, where $\varphi(b)$ is given by (2.2). Set $u(b) = \min(2\varphi(b), \sqrt{2})$, i.e., $u(b) = 2\varphi(b)$ for $b \in (1/2, 3/4]$, and $u(b) = \sqrt{2}$ for $b \in (3/4, 1]$. Consider the test

$$\psi^{L} = \mathbb{1}_{\{\max_{1 \le l \le N - 1} L(u_l, b_l) > H\}}, \quad u_l = u(b_l),$$

where the function $L$ is defined in (5.7) and $(b_l)_{1 \le l \le N}$ consists of a regular grid on $(1/2, 1]$, that is, $b_l = 1/2 + l\delta$, where $\delta$ is a positive parameter that satisfies $\delta = o_d(1)$, $T_d\, \delta \to +\infty$ and $N\delta = 1/2$. This entails that $N = O_d(\delta^{-1})$ and thus $N = o_d(T_d)$. Take a positive $H$ such that $H \sim (\log d)^C$ for some positive constant $C$ satisfying $C > 1/4$.

For a constant $D > \sqrt{2}$, consider also the test

$$\psi^{\max} = \mathbb{1}_{\{\max_{1 \le j \le d}\, \max_{1 \le l \le N} t_{j, b_l} > D\, T_d\}}.$$

Finally, combining $\psi^{L}$ and $\psi^{\max}$, we define the test procedure

$$\psi^{HC} = \psi^{L}\, \psi^{\max}, \tag{5.10}$$

that rejects $H_0$ if both $\psi^{L}$ and $\psi^{\max}$ reject $H_0$.

For the high sparsity case, not only separation rates but also sharp asymptotics are obtained; 
two ranges of b should be distinguished: the range of b in (1/2, 3/4], called the intermediate sparsity 
case, and the range of b in (3/4, 1), called the highest sparsity case. 
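
The grid of sparsity indices used by the Higher-Criticism type test, together with the associated thresholds $u(b_l)$, can be generated as in the following Python sketch. The number of grid points is passed in as a free parameter: how it should grow with $d$ is governed by the asymptotic conditions on $\delta$, and the specific value used in an application is an assumption of the user. The function names are hypothetical.

```python
import math


def phi_boundary(b):
    """Boundary function of (2.2), defined for 1/2 < b < 1."""
    if not 0.5 < b < 1.0:
        raise ValueError("b must lie in (1/2, 1)")
    if b <= 0.75:
        return math.sqrt(2.0 * b - 1.0)
    return math.sqrt(2.0) * (1.0 - math.sqrt(1.0 - b))


def u_of_b(b):
    """u(b) = min(2 * phi(b), sqrt(2)); this equals sqrt(2) for b in (3/4, 1]."""
    if b > 0.75:
        return math.sqrt(2.0)
    return min(2.0 * phi_boundary(b), math.sqrt(2.0))


def sparsity_grid(n_points):
    """Pairs (b_l, u_l) with b_l = 1/2 + l * delta, l = 1..N-1, and N * delta = 1/2."""
    delta = 0.5 / n_points
    return [(0.5 + l * delta, u_of_b(0.5 + l * delta)) for l in range(1, n_points)]
```

For example, `sparsity_grid(10)` returns nine pairs, starting at $b_1 = 0.55$ and ending at $b_9 = 0.95$, with thresholds increasing up to $\sqrt{2}$.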

Theorem 5.2. Assume that $r_\varepsilon \to 0$ and that $\log d = o(\varepsilon^{-2/(2\tau + 1)})$. Let $a(r_\varepsilon)$ be given by (4.8) and let $\varphi$ be given by (2.2).

• (i) Lower bound. If $\limsup a(r_\varepsilon) / T_d < \varphi(b)$, then $\gamma \to 1$.

• (ii) Upper bound: errors of $\psi^{HC}$ defined by (5.10).

- Type I error: $\omega(\psi^{HC}) = o(1)$.

- Type II error: if $\liminf a(r_\varepsilon) / T_d > \varphi(b)$, then $\beta(\psi^{HC}, \Theta_d(\tau, r_\varepsilon, b)) = o(1)$.

Remark 5.2. • Set $a(r_\varepsilon^*) = T_d\, \varphi(b)$. In our sparse functional framework, the distinguishability conditions are the same as for a $d$-dimensional sparse vector (see, e.g., [12]), with the only difference that in our case the assumption $\log d = o(\varepsilon^{-2/(2\tau + 1)})$ is required. Under this assumption, the result of Theorem 5.2 means that distinguishability is impossible if $\limsup a(r_\varepsilon) / a(r_\varepsilon^*) < 1$ and it is possible if $\liminf a(r_\varepsilon) / a(r_\varepsilon^*) > 1$. Due to (4.8), these conditions provide sharp separation rates since they are equivalent to $\limsup r_\varepsilon / r_\varepsilon^* < 1$ and $\liminf r_\varepsilon / r_\varepsilon^* > 1$, respectively, where

$$r_\varepsilon^* \sim \bigl( \varepsilon^4\, T_d^2\, (c_1(\tau))^{-1} \varphi^2(b) \bigr)^{\tau / (4\tau + 1)}, \tag{5.11}$$

and $c_1(\tau)$ is defined by (4.9). Note that the condition $r_\varepsilon^* \to 0$ is fulfilled under the assumption $\log d = o(\varepsilon^{-2/(2\tau + 1)})$.

The values $r_\varepsilon^*$ mark the border between the areas of distinguishability and non-distinguishability. Indeed, for $r_\varepsilon \to 0$ such that $\limsup r_\varepsilon / r_\varepsilon^* < 1$, the alternatives separated from the null hypothesis by $r_\varepsilon$ are not distinguishable and, on the other side, for $r_\varepsilon \to 0$ such that $\liminf r_\varepsilon / r_\varepsilon^* > 1$, the alternatives separated from the null hypothesis by $r_\varepsilon$ are distinguishable.

• Actually, the assumption $\log d = o(\varepsilon^{-2/(2\tau + 1)})$ is equivalent to

$$(r_\varepsilon^*)^{1/(2\tau)}\, T_d = o(1), \tag{5.12}$$

which is required when dealing with the asymptotic behavior of the tail distribution of $t_{j,b}$ (see Lemma 7.1) since $T_d \sup_k w_k(r_\varepsilon^*) \asymp (r_\varepsilon^*)^{1/(2\tau)}\, T_d$. Relation (5.12) follows from the relations in (4.10). Concerning the lower bound, condition (5.12) is necessary when we evaluate the second moment of the Bayesian likelihood ratio under the null hypothesis.

• Note that the condition $\log d = o(\varepsilon^{-2/(2\tau + 1)})$ is essential for $b \in (1/2, 1)$. Namely, it follows from Theorem 2 in [10] that if $\liminf (\log d \cdot \varepsilon^{2/(2\tau + 1)}) > 0$, then the separation rates are of the form $r_\varepsilon^* = \varepsilon \sqrt{\log d}$ for any $b \in (1/2, 1)$. Observe that if $\log d > c\, \varepsilon^{-2}$ for some $c > 0$, then the separation rates are bounded away from zero, i.e., it is impossible to detect functions lying in $\Theta_d(\tau, r_\varepsilon, b)$ with small enough $r_\varepsilon > 0$.

Remark 5.3. Adaptation. 

In the high sparsity case, a family of test procedures $\psi^{HC}$ provides the distinguishability for all $b \in (1/2, 1)$. Moreover, it follows from the proofs that our result is uniform over $b \in (1/2 + \rho, 1 - \rho)$ for any $\rho \in (0, 1/4)$, i.e., the results are adaptive over $b \in (1/2 + \rho, 1 - \rho)$ for any $\rho \in (0, 1/4)$, without a loss in separation rates.

For the moderate sparsity case, it is worth noting that the family of test procedures $\psi^{\chi^2}_{\alpha, b}$ depends on $b \in (0, 1/2]$ since the sequence of weights $w(r_\varepsilon^*(b))$ does. It is shown in Theorem 3 of [10] that "adaptive" separation rates for unknown $b \in (0, 1/2)$ are of the form $r_\varepsilon^* \asymp (\varepsilon^4 d^{2b - 1} \log\log d)^{\tau / (4\tau + 1)}$, i.e., the adaptive case leads to an unavoidable log log-loss in separation rates compared to the non-adaptive setting. Using the Bonferroni method, it is possible to prove that the test procedures based on a grid of tests of the form $\psi^{\chi^2}_{\alpha, b}$ are adaptive rate-optimal test procedures. Since this result is similar to the one stated in [10], we omit it.

6. Extended problem 

In this section, we generalize the hypothesis testing problem stated in (1.3) to more general alternatives. The null hypothesis $H_0$ is still characterized by some constant $\mathrm{const}_0$ and, as in (1.3), under the alternative, the signal function $f$ is, up to some constant, equal to $f^1$, i.e., $f = \mathrm{const}_1 + f^1$. The additive sparse structure on $f^1$ is still assumed, i.e., $f^1 \in \mathcal{F}_{d,b}$, and every component $f_j$ is assumed 1-periodic and orthogonal to a constant (recall that for any $t \in [0,1]^d$, $f^1(t) = \sum_{j=1}^{d} \xi_j f_j(t_j)$ where $\xi_j \in \{0, 1\}$ and $t_j \in [0, 1]$ for any $j \in \{1, \ldots, d\}$). We then denote by $\tilde{\mathcal{F}}_{d,b}$ the set of signal functions in $\mathcal{F}_{d,b}$ whose components are 1-periodic and orthogonal to a constant. Rather than imposing smoothness constraints component-wise, we now study the alternative classes for which the smoothness and separation conditions are expressed in terms of the whole signal function $f^1$. In other words, the main difference between the extended and initial detection problems is that the distinguishability problem is studied with respect to a global signal.

Then, given the alternatives that include signal functions $f$ as in (1.3), where $f^1$ belongs to the
functional class $\mathcal{F}_d^{ext}(\tau, L, r_\varepsilon, b)$, the testing problem of interest is stated as follows:

$$H_0 : f = \mathrm{const}_0 \quad \text{versus} \quad H_1 : f = \mathrm{const}_1 + f^1, \ f^1 \in \mathcal{F}_d^{ext}(\tau, L, r_\varepsilon, b), \qquad (6.1)$$

where

$$\mathcal{F}_d^{ext}(\tau, L, r_\varepsilon, b) = \{ f^1 \in \tilde{\mathcal{F}}_{d,b} : \|f^1\|_2 \ge r_\varepsilon, \ \|f^1\|_2^{(\tau)} \le L \},$$

in which $(\|f^1\|_2^{(\tau)})^2 = \sum_{j=1}^d \xi_j (\|f_j\|_2^{(\tau)})^2$. Due to the periodic constraint, we consider the standard
Fourier basis. This allows us to express the semi-norm $\|\cdot\|_2^{(\tau)}$ in terms of Fourier coefficients.
As in Section 3, we then transform the functional space $\mathcal{F}_d^{ext}(\tau, L, r_\varepsilon, b)$ to the sequence space
$\Theta_d^{ext}(\tau, L, r_\varepsilon, b)$, which consists of sequences $\theta = (\xi_j \theta_{j,k})_{j,k}$ such that
j=i kez, 
Note that if $L^2 = K$ and $\tilde{r}_\varepsilon^2 = K r_\varepsilon^2$, then we have

$$\Theta_d^{ext}(\tau, L, \tilde{r}_\varepsilon, b) \supset \Theta_d(\tau, r_\varepsilon, b).$$

This implies that the results on the lower bound continue to hold for $\Theta_d^{ext}(\tau, L, \tilde{r}_\varepsilon, b)$ with the
separation rates $(\tilde{r}_\varepsilon^*)^2 = K (r_\varepsilon^*)^2$, where $r_\varepsilon^*$ is defined by either (5.9) or (5.11) depending on the
values of $b$. Here, the quantity of interest is $\tilde{a}(r_\varepsilon)$, the solution of the following extremal problem:

d 

E^E^^ 

j=i fcez 

As follows from Section 4.3 in [13], the solution of the extremal problem (6.2) is given by
$$\tilde{a}(r_\varepsilon) \sim (c_1(\tau))^{1/2} K r_\varepsilon^{2 + 1/(2\tau)} \varepsilon^{-2} \quad \text{as } r_\varepsilon \to 0,$$

where $c_1(\tau)$ is defined in (4.9). That is, $\tilde{a}(r_\varepsilon) = K a(r_\varepsilon)$, where $a(r_\varepsilon)$ is the solution (4.8) of the
extremal problem (4.1).

Remark 6.1. Consider the function $\kappa$ defined by (4.4), for which the sequence of weights $w(r_\varepsilon) =
(w_k(r_\varepsilon))_k$ is defined as in (4.2). Then we obtain from (4.7) that

1 d 

inf y^ K (e ]} w(r e )) > S(r e ) = Ka(r e ), (6.3) 

and, similarly to Proposition 4.1, for any $D > 1$,

1 d 

inf i r^(« j ,iD(r t )) > D 2 a(r t )=D 2 Ka(r t ). (6.4) 

e e e| a!t (T,ii' 1 /2,Di<:i/2r < ,,6) e ~J 

Now, as in Section 3, with the use of the orthonormal system, instead of considering the random
process $X(t)$ defined in model (1.1), we observe a family of random sequences $(x_{j,k})_{k \in \mathbb{Z},\, j \in \{1,\ldots,d\}}$
defined by (3.1). Finally, the remaining question is: do the families of test procedures $\psi^{\chi^2}_{\alpha}$ given by
(5.8) and $\psi^{HC}$ given by (5.10) provide distinguishability? The answer is affirmative and is given
below. Note that it is then sufficient to study the type II error probability of these tests since their
type I error probability has already been studied for the hypothesis testing problem (1.3).

Theorem 6.1. Assume that $r_\varepsilon \to 0$ and let $a(r_\varepsilon)$ and $\varphi$ be given by (4.8) and (2.2), respectively.
Then, the following results hold true.

• (i) Moderate sparsity - Type II error probability of $\psi^{\chi^2}_{\alpha}$ defined by (5.8).
If $a(r_\varepsilon) d^{1/2 - b} \to +\infty$, then $\beta(\psi^{\chi^2}_{\alpha}, \Theta_d^{ext}(\tau, K^{1/2}, K^{1/2} r_\varepsilon, b)) = o(1)$.

• (ii) High sparsity - Type II error probability of $\psi^{HC}$ defined by (5.10).
Assume that $\log d = o(\varepsilon^{-2/(2\tau+1)})$.

If $\liminf a(r_\varepsilon)/T_d > \varphi(b)$, then $\beta(\psi^{HC}, \Theta_d^{ext}(\tau, K^{1/2}, K^{1/2} r_\varepsilon, b)) = o(1)$.

Remark 6.2. One should note that the detection boundaries are the same for the hypothesis testing 
problems (1.3) and (6.1), the initial one and its generalization. 

7. Proofs 

Proofs of our main results require some preliminary results that are stated below both under the 
null and alternative hypotheses. Specifically, we establish asymptotic tail distributions of the test 
statistics in hand and find their first and second moments. 
7.1. Properties of test statistics

In this section, we consider the statistics $t_j$ defined by (5.1) with any sequence of weights $w =
(w_k)_{k \in \mathbb{Z}}$ such that $w_k \ge 0$, $\forall k \in \mathbb{Z}$, and $\sum_k w_k^2 = 1/2$. Therefore the quantities $L(u)$, $C_u$, and $\Phi_0$
are those defined by (5.3), (5.5) and (5.4).

Proposition 7.1. Asymptotic tail distribution of $t_j$ defined by (5.1).
Assume $T \max_k w_k = o(1)$, then

$$\log P_0(t_j > T) \sim -\frac{T^2}{2} \quad \text{as } T \to +\infty,$$

$$\log P_{\theta_j}(t_j > T) \sim -\frac{(T - E_{\theta_j}(t_j))^2}{2} \quad \text{as } (T - E_{\theta_j}(t_j)) \to +\infty.$$



Proof of Proposition 7.1.

We consider only the distribution $P_{\theta_j}$, since $P_0$ is a particular case of $P_{\theta_j}$. The proof consists
of bounding $P_{\theta_j}(t_j > T)$ from above and below. This is done by using the cumulant-generating
function of $t_j$ under $P_{\theta_j}$, which is defined by $\phi_{\theta_j}(h) = \log(E_{\theta_j}(\exp(h t_j)))$ for any $h$. Let us consider
only positive $h$ and let us introduce a new family of probability measures $P_{\theta_j,h}$ such that
$dP_{\theta_j,h}/dP_{\theta_j} = \exp(h t_j) \exp(-\phi_{\theta_j}(h))$. This yields

$$P_{\theta_j}(t_j > T) = E_{\theta_j,h}\big[\mathbf{1}_{\{t_j > T\}} \exp(-(h t_j - \phi_{\theta_j}(h)))\big]$$

$$= \exp(-(hT - \phi_{\theta_j}(h))) \, E_{\theta_j,h}\big[\mathbf{1}_{\{t_j > T\}} \exp(-h(t_j - T))\big]. \qquad (7.1)$$

Let us start with the upper bound.

Upper bound. The second term on the right-hand side of (7.1) is less than 1. Hence there is a
straightforward upper bound on $P_{\theta_j}(t_j > T)$:

$$P_{\theta_j}(t_j > T) \le \exp(-(hT - \phi_{\theta_j}(h))). \qquad (7.2)$$

To complete this part of the proof, it remains to minimize the right-hand side of (7.2) over
positive $h$. The minimum is attained for positive $h$ such that

$$E_{\theta_j,h}(t_j) = T \qquad (7.3)$$

since

$$(\phi_{\theta_j}(h) - hT)' = E_{\theta_j,h}(t_j) - T,$$

$$(\phi_{\theta_j}(h) - hT)'' = \mathrm{Var}_{\theta_j,h}(t_j) > 0,$$

where $(\cdot)'$ and $(\cdot)''$ denote the first and second derivatives with respect to $h$, respectively, and
$E_{\theta_j,h}$ and $\mathrm{Var}_{\theta_j,h}$ are the expectation and variance with respect to $P_{\theta_j,h}$.

Q ■ u 

In order to find h that solves equation (7.3), we need to determine <pg j . For this, set Vj k = 
Then for any positive $h$ such that $h \to +\infty$ and $h \max_k w_k = o(1)$, we obtain

$$\phi_{\theta_j}(h) = \log \prod_k E_{\theta_j}\big[\exp(h w_k((v_{j,k} + \nu_{j,k})^2 - 1))\big]$$

$$= \sum_k \Big\{ -h w_k + h w_k v_{j,k}^2 (1 + 2 h w_k + o(h w_k)) - \tfrac{1}{2}\big(-2 h w_k - 2 h^2 w_k^2 (1 + o(1))\big) \Big\}$$

$$= \sum_k \Big\{ h w_k v_{j,k}^2 (1 + o(h \max_k w_k)) + h^2 w_k^2 + o(h^2 w_k^2) \Big\}$$

$$= h E_{\theta_j}(t_j)(1 + o(h \max_k w_k)) + \frac{h^2}{2}(1 + o(1)) + o(h^2), \qquad (7.4)$$

where the last equality in (7.4) follows from $(T - E_{\theta_j}(t_j)) \to +\infty$ and $T \max_k w_k = o(1)$ as
$T \to +\infty$. Next, differentiating the right-hand side of (7.4) with respect to $h$ yields

$$(\phi_{\theta_j}(h) - hT)' = 0 \implies h \sim T - E_{\theta_j}(t_j), \quad \text{as } T - E_{\theta_j}(t_j) \text{ goes to infinity}.$$

As $(T - E_{\theta_j}(t_j)) \to +\infty$, this leads to the following optimal upper bound for the right-hand side
of (7.2):

$$\exp\Big( (T - E_{\theta_j}(t_j)) E_{\theta_j}(t_j) + \frac{(T - E_{\theta_j}(t_j))^2}{2} - T (T - E_{\theta_j}(t_j)) \Big) \sim \exp\Big( -\frac{(T - E_{\theta_j}(t_j))^2}{2} \Big).$$

Since by assumption $T \max_k w_k = o(1)$, the condition $h \max_k w_k = o(1)$ with $(T - E_{\theta_j}(t_j))$ in place
of $h$ is fulfilled.

By assumption $T \max_k w_k = o(1)$, hence the optimal upper bound under $P_0$ is $\exp(-T^2/2)$ as $T$ goes
to infinity. This completes the proof of the upper bound.

Lower bound. We are interested in obtaining a lower bound for (7.1). This is done by first
considering a new family of probability distributions under which the normalized statistics $t_j$ are
proved to be asymptotically Gaussian.

For $h > 0$ satisfying equation (7.3), let us introduce the following probability measures $P_{\theta_j,h,k}$:

$$\frac{dP_{\theta_j,h,k}}{dP_{\theta_{j,k}}} = \exp(h t_{j,k}) \exp(-\phi_{\theta_{j,k}}(h)),$$

with $t_{j,k}$ defined in (5.2), $\phi_{\theta_{j,k}}(h) = \log E_{\theta_{j,k}}(\exp(h t_{j,k}))$ and where $E_{\theta_{j,k}}$ stands for the expectation
with respect to the observations $(x_{j,k})_{j,k}$ of (3.1). Denote by $E_{\theta_j,h,k}$ and $\mathrm{Var}_{\theta_j,h,k}$ the expectation
and variance with respect to $P_{\theta_j,h,k}$.

To establish the asymptotic normality of $t_j$, we will check that the Lyapunov condition is
satisfied. To this end, set $\sigma^2_{j,h,k} = \mathrm{Var}_{\theta_j,h,k}(t_{j,k})$ and $\sigma^2_{j,h} = \sum_k \sigma^2_{j,h,k}$.

Denote by $\phi^{(2)}_{\theta_{j,k}}$ and $\phi^{(4)}_{\theta_{j,k}}$ the second and fourth derivatives of $\phi_{\theta_{j,k}}$ with respect to $h$, respectively.
Using well-known relations between moments of $t_{j,k}$ under $P_{\theta_j,h,k}$ and the successive derivatives of
$\phi_{\theta_{j,k}}(h)$ with respect to $h$, in particular, $\sigma^2_{j,h} = \sum_k \phi^{(2)}_{\theta_{j,k}}(h)$, we get

< 4maxK)E fc ^(l + o(l))+o(l) 
1 

where the last relation follows from $\max_k w_k = o(1)$ and relation (7.4), since by (7.4) we get
$\sum_k \phi^{(4)}_{\theta_{j,k}}(h) = \phi^{(4)}_{\theta_j}(h) = o(1)$. The Lyapunov condition is then satisfied. This implies that under
$P_{\theta_j,h}$, the variable $Z_{j,h} = (t_j - E_{\theta_j,h}(t_j))/\sigma_{j,h}$ is asymptotically a real standard Gaussian random variable.

Let us return to relation (7.1), where $h$ is chosen to have $E_{\theta_j,h}(t_j) = T$, and observe that

$$E_{\theta_j,h}\big[\mathbf{1}_{\{t_j > T\}} \exp(-h(t_j - T))\big] = E_{\theta_j,h}\big[\mathbf{1}_{\{Z_{j,h} > 0\}} \exp(-h Z_{j,h} \sigma_{j,h})\big].$$

Due to the asymptotic normality of $t_j$, for any $\delta > 0$,

$$E_{\theta_j,h}\big[\mathbf{1}_{\{Z_{j,h} > 0\}} \exp(-h Z_{j,h} \sigma_{j,h})\big] = E_{\theta_j,h}\big[\mathbf{1}_{\{Z_{j,h} \in (0,\delta)\}} \exp(-h Z_{j,h} \sigma_{j,h})\big] + E_{\theta_j,h}\big[\mathbf{1}_{\{Z_{j,h} \ge \delta\}} \exp(-h Z_{j,h} \sigma_{j,h})\big]$$

$$\ge \big(P_{\theta_j,h}(Z_{j,h} \in (0,\delta)) + o(1)\big) \exp(-h \delta \sigma_{j,h}). \qquad (7.5)$$

Choosing $\delta = o(h)$ in relation (7.5) implies that

$$\log(P_{\theta_j}(t_j > T)) \ge \phi_{\theta_j}(h) - hT - o(h^2). \qquad (7.6)$$

Up to $o(h^2)$, the right-hand side of (7.6) corresponds to the argument of the exponential function
on the right-hand side of (7.2). This entails that the right-hand side of (7.6) is equivalent to
$-(T - E_{\theta_j}(t_j))^2/2$. This completes the proof of the lower bound, and thus Proposition 7.1 is proved.
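As a numerical illustration of the upper-bound step under the null hypothesis, the following sketch minimizes the Chernoff exponent $\phi_0(h) - hT$ directly. It assumes a finite, purely hypothetical weight sequence with equal weights and $\sum_k w_k^2 = 1/2$, and uses the exact null cumulant-generating function of $\sum_k w_k(\nu_k^2 - 1)$ with standard Gaussian $\nu_k$; the function names are illustrative and not taken from the paper. When $T \max_k w_k$ is small, the minimized exponent should be close to $-T^2/2$.

```python
import math

def cgf_null(h, weights):
    """CGF of t = sum_k w_k (nu_k^2 - 1) with nu_k i.i.d. N(0, 1); needs 2*h*w_k < 1."""
    return sum(-h * w - 0.5 * math.log(1.0 - 2.0 * h * w) for w in weights)

def cgf_derivative(h, weights):
    """Derivative of the CGF, i.e. the mean of t under the exponentially tilted measure."""
    return sum(-w + w / (1.0 - 2.0 * h * w) for w in weights)

def chernoff_exponent(T, weights, iters=80):
    """Minimise phi(h) - h*T over 0 < h < 1/(2 max w) by bisection on phi'(h) = T."""
    lo, hi = 0.0, (1.0 - 1e-9) / (2.0 * max(weights))
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if cgf_derivative(mid, weights) < T:
            lo = mid
        else:
            hi = mid
    h = 0.5 * (lo + hi)
    return cgf_null(h, weights) - h * T

n = 20_000
weights = [math.sqrt(0.5 / n)] * n   # hypothetical equal weights, sum of squares = 1/2
T = 5.0                              # here T * max(w) = 0.025, which is small
exponent = chernoff_exponent(T, weights)
ratio = exponent / (-T * T / 2.0)    # expected to be close to 1
print(f"exponent = {exponent:.3f}, -T^2/2 = {-T * T / 2:.3f}, ratio = {ratio:.3f}")
```

The ratio approaches 1 as $T \max_k w_k$ decreases, which is the regime required by Proposition 7.1.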

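Lemma 7.1. • (i) Expectation and variance of $t_j$ defined by (5.1).

$$E_{\theta_j}(t_j) = \xi_j \varepsilon^{-2} \kappa(\theta_j, w), \qquad (7.7)$$
$$\mathrm{Var}_{\theta_j}(t_j) = 1 + O\big((\max_k w_k) E_{\theta_j}(t_j)\big). \qquad (7.8)$$

• (ii) Expectation and variance of $L(u)$ defined by (5.3). Assume that $T_d \max_k w_k = o(1)$ and
consider any $\theta = (\xi_1 \theta_1, \ldots, \xi_d \theta_d)$ such that $\sum_{j=1}^d \xi_j = d^{1-b}$. Moreover, assume that for all
nonzero $\xi_j$, $E_{\theta_j}(t_j) \ge c T_d$, with some positive $c$, and $\max_j E_{\theta_j}(t_j) = O(T_d)$. Then, for all
$u \in (0, \sqrt{2}]$,

$$E_\theta(L(u)) \ge d^{\,1/2 - b + \left(\frac{u^2}{4} - \frac{((u - c)_+)^2}{2}\right)(1 + o(1))} (1 + o(1)),$$
$$\mathrm{Var}_\theta(L(u)) = o(d^{\eta} E_\theta(L(u))) + o(1), \quad \eta = o(1),$$
where $x_+ = \max(0, x)$.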

Remark 7.1. Under $P_0$, the statistics $t_j$ and $L(u)$ have zero mean and unit variance. Moreover,
under $P_0$ and the assumption $\max_k w_k = o(1)$, the statistics $t_j$ are asymptotically standard
Gaussian. Under $P_{\theta_j}$, the statistics $t_j - E_{\theta_j} t_j$ are asymptotically standard Gaussian if
$\max_k w_k E_{\theta_j} t_j = o(1)$, see Lemma 3.1 in [13].



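Proof of Lemma 7.1.

(i) Recall that $\sum_k w_k^2 = 1/2$. For each index $j$ satisfying $\xi_j = 1$, the random variable $(x_{j,k}/\varepsilon)^2$ is a
$P_{\theta_j}$-noncentral $\chi^2(1, \theta_{j,k}^2 \varepsilon^{-2})$. From this, relation (7.7) is easily obtained. Relation (7.8) is deduced
from the following calculations:

$$\mathrm{Var}_{\theta_j}(t_j) = \sum_{k \in \mathbb{Z}} w_k^2 (2 + 4 \varepsilon^{-2} \theta_{j,k}^2)$$
$$= 1 + 4 \varepsilon^{-2} \sum_{k \in \mathbb{Z}} w_k^2 \theta_{j,k}^2$$
$$= 1 + O\big(\max_k w_k \, \varepsilon^{-2} \xi_j \kappa(\theta_j, w)\big)$$
$$= 1 + O\big(\max_k w_k \, E_{\theta_j}(t_j)\big).$$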
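The variance computation above can be checked on a small example. The sketch below assumes, as an illustration only, that $\kappa(\theta_j, w) = \sum_k w_k \theta_{j,k}^2$ (the definition (4.4) is not reproduced in this section), and uses a hypothetical finite weight sequence normalized so that $\sum_k w_k^2 = 1/2$. It verifies that the exact variance is bounded by $1 + 4 \max_k w_k \, E_{\theta_j}(t_j)$.

```python
import math

def mean_and_variance(weights, theta, eps):
    """Mean and variance of t = sum_k w_k ((x_k/eps)^2 - 1), x_k ~ N(theta_k, eps^2).

    Uses the moments of a noncentral chi-square with one degree of freedom and
    noncentrality lam: mean 1 + lam, variance 2 + 4*lam.
    """
    lam = [(th / eps) ** 2 for th in theta]
    mean = sum(w * l for w, l in zip(weights, lam))
    var = sum(w * w * (2.0 + 4.0 * l) for w, l in zip(weights, lam))
    return mean, var

n = 50
raw = [1.0 - k / n for k in range(n)]              # hypothetical decreasing profile
norm = math.sqrt(2.0 * sum(r * r for r in raw))
weights = [r / norm for r in raw]                  # now sum of squares equals 1/2
theta = [0.05] * n                                 # hypothetical signal coefficients
mean, var = mean_and_variance(weights, theta, eps=0.1)
bound = 1.0 + 4.0 * max(weights) * mean
print(f"mean = {mean:.4f}, variance = {var:.4f}, bound = {bound:.4f}")
```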

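(ii) For any $u \in (0, \sqrt{2}]$, as $T_d \to +\infty$, Proposition 7.1 gives a control over $C_u$ defined by (5.5):

$$C_u^2 = d^{-1} \exp\Big(\frac{u^2 T_d^2}{2}(1 + o(1))\Big) \Big(1 - \exp\Big(-\frac{u^2 T_d^2}{2}(1 + o(1))\Big)\Big)^{-1}.$$

Since $u \le \sqrt{2}$, the exponent of $d$ in $C_u$ is $o(1)$.

Case 1: for the nonzero $\xi_j$'s, assume that $\limsup(u T_d - E_{\theta_j}(t_j)) < +\infty$. In this case, the probability
$P_{\theta_j}(t_j > u T_d) = P_{\theta_j}(t_j - E_{\theta_j}(t_j) > u T_d - E_{\theta_j}(t_j))$ is bounded away from zero. This follows from
the asymptotic normality of $t_j - E_{\theta_j}(t_j)$ for $E_{\theta_j}(t_j) = O(T_d)$ (see Remark 7.1).
Case 2: for the nonzero $\xi_j$'s, assume that $u T_d - E_{\theta_j}(t_j) \to +\infty$. Then, for any nonzero $\xi_j$,
Proposition 7.1 implies that

$$\log P_{\theta_j}(t_j > u T_d) \ge -\frac{(u T_d - c T_d)^2}{2}(1 + o(1)).$$

Recall that the number of nonzero $\xi_j$ is equal to $K = d^{1-b}$ and that for all nonzero $\xi_j$, $E_{\theta_j}(t_j) \ge
c T_d$ for some positive $c$ such that $\max_{j : \xi_j = 1} E_{\theta_j}(t_j) = O(T_d)$. To sum up, cases 1 and 2 entail
that

$$E_\theta(L(u)) = C_u \sum_{j : \xi_j = 1} \big(P_{\theta_j}(t_j > u T_d) - \Phi_0(u T_d)\big)$$

$$\ge C_u K \Big( d^{-\frac{((u - c)_+)^2}{2}(1 + o(1))} - d^{-\frac{u^2}{2}(1 + o(1))} \Big)$$

$$= d^{-\frac{1}{2} + \frac{u^2}{4}(1 + o(1)) + 1 - b} \Big( d^{-\frac{((u - c)_+)^2}{2}(1 + o(1))} - d^{-\frac{u^2}{2}(1 + o(1))} \Big)$$

$$= d^{\,1/2 - b + \left(\frac{u^2}{4} - \frac{((u - c)_+)^2}{2}\right)(1 + o(1))} (1 + o(1)).$$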
Similarly, let us study the variance of L(u). Using Proposition 7.1, we obtain 

Var w (L(u)) = C 2 U ^ P B] (t 3 > uT d )Pg ] (t ] < uT d ) + C 2 U £ $o(uT d )(l - $ (uT d )) 

= GlKP Bi (tj > uT d )(l + o(l)) + (d*- 1 + d- b )(l + o(l)) 
= (C u Eg(L(u)) + d b - l )(l + o(l)) 
= o(d r >E s (L(u))) + o(l), n = o(l). 

7.2. Upper bound 

Remark 7.2. Note that the condition $T_d \max_k w_k(r_\varepsilon^*) = o(1)$ follows from the assumption $\log d =
o(\varepsilon^{-2/(2\tau+1)})$. Indeed, Remark 4.2 and relations (4.10) imply that $T_d \max_k w_k(r_\varepsilon^*) \le (r_\varepsilon^*)^{1/(2\tau)} T_d$,
where the term on the right-hand side goes to zero as soon as $\log d = o(\varepsilon^{-2/(2\tau+1)})$. Therefore, the
assumption $\log d = o(\varepsilon^{-2/(2\tau+1)})$ allows us to apply Proposition 7.1 and Lemma 7.1.
Proof of (ii)-Theorem 5.1.

Type I error probability of $\psi^{\chi^2}_{\alpha}$. It follows from the Central Limit Theorem that, under the
null hypothesis, $t_b$ is asymptotically a standard normal random variable. Therefore

$$P_0(t_b > T_\alpha) = \Phi(-T_\alpha) + o(1) = \alpha + o(1).$$

Type II error probability of $\psi^{\chi^2}_{\alpha}$ uniformly over $\Theta_d(\tau, r_\varepsilon, b)$ for $r_\varepsilon \ge B r_\varepsilon^*$, $B \ge 1$. Thanks
to Lemma 7.1, uniformly over $\theta \in \Theta_d(\tau, r_\varepsilon, b)$, we have



1 d 



(2 
3 = 1 

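This implies that $\mathrm{Var}_\theta(t_b) = o((E_\theta(t_b))^2)$ provided that $E_\theta(t_b) \to +\infty$. Let us study $E_\theta(t_b)$:
from Proposition 4.1, Lemma 7.1, and relation (4.7), we get uniformly over $\Theta_d(\tau, r_\varepsilon, b)$ with $r_\varepsilon \ge
B r_\varepsilon^*$, $B \ge 1$:

$$E_\theta(t_b) \ge d^{1/2 - b} B^2 a(r_\varepsilon^*) \to +\infty \ \text{ as soon as } \ B^2 d^{1/2 - b} a(r_\varepsilon^*) \asymp B^2 \to +\infty, \ \text{ i.e., as soon as } r_\varepsilon / r_\varepsilon^* \to +\infty,$$

where $r_\varepsilon^* \asymp (\varepsilon^4 d^{2b - 1})^{\tau/(4\tau + 1)}$.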

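Due to (7.9), using Markov's inequality and Lemma 7.1, for all $\theta$ in $\Theta_d(\tau, r_\varepsilon, b)$,

$$P_\theta(t_b \le T_\alpha) = P_\theta(t_b - E_\theta(t_b) \le T_\alpha - E_\theta(t_b))$$
$$\le P_\theta(|t_b - E_\theta(t_b)| \ge E_\theta(t_b) - T_\alpha)$$
$$\le \frac{\mathrm{Var}_\theta(t_b)}{(E_\theta(t_b) - T_\alpha)^2} = o(1).$$

This entails that $\beta(\psi^{\chi^2}_{\alpha}, \Theta_d(\tau, r_\varepsilon, b))$ goes to zero as soon as $d^{1/2 - b} a(r_\varepsilon) \to +\infty$, i.e., as soon as
$a(r_\varepsilon)/a(r_\varepsilon^*) \to +\infty$, where $a(r_\varepsilon^*) \asymp d^{b - 1/2}$.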

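Proof of (ii)-Theorem 5.2.

Type I error probability of $\psi^{HC}$. Observe that $\alpha(\psi^{HC}) \le \alpha(\psi^{L}) + \alpha(\psi^{\max})$. The assumption
$\log d = o(\varepsilon^{-2/(2\tau+1)})$ implies that $T_d \max_k w_k(r_\varepsilon^*) = o(1)$. Therefore the application of Proposition
7.1 and the fact that $D^2 > 2$ and $N = o(T_d)$ yield

$$\alpha(\psi^{\max}) = P_0\Big(\max_{1 \le l \le N} \max_{1 \le j \le d} t_{j,b_l} > D T_d\Big) \le \sum_{l=1}^{N} \sum_{j=1}^{d} P_0(t_{j,b_l} > D T_d)$$

$$\le N d \exp\big(-D^2 T_d^2/2 \, (1 + o_d(1))\big) = N d^{\,1 - D^2/2 \, (1 + o_d(1))} \to 0.$$

By Lemma 7.1 and applying Markov's inequality,

$$\alpha(\psi^{L}) = P_0\Big(\max_{1 \le l \le N-1} L(u_l, b_l) > H\Big) \le \sum_{l=1}^{N-1} P_0(L(u_l, b_l) > H)$$

$$\le \sum_{l=1}^{N-1} \frac{\mathrm{Var}_0(L(u_l, b_l))}{H^2} \le \frac{N - 1}{H^2},$$

which goes to zero as $d \to +\infty$ since $H \sim (\log d)^C$, with $C > 1/4$ and $N = O_d(T_d)$.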
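Type II error probability of $\psi^{HC}$ uniformly over $\Theta_d(\tau, r_\varepsilon, b)$. For any $\theta \in \Theta_d(\tau, r_\varepsilon, b)$, we
obtain

$$E_\theta(1 - \psi^{HC}) \le \min\big(E_\theta(1 - \psi^{L}), E_\theta(1 - \psi^{\max})\big), \qquad (7.10)$$

$$E_\theta(1 - \psi^{\max}) \le \min_{1 \le j \le d} \min_{1 \le l \le N} P_{\theta_j}(t_{j,b_l} \le D T_d). \qquad (7.11)$$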

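First, let us consider the alternatives $\theta \in \Theta_d(\tau, r_\varepsilon, b)$ such that for a nonzero $\xi_j$, there exists $l \in
\{1, \ldots, N\}$ for which $E_{\theta_j} t_{j,b_l} \ge D_1 T_d$ with $D_1 > D$. From Lemma 7.1(i) and Markov's inequality,
we obtain

$$P_{\theta_j}(t_{j,b_l} \le D T_d) \le P_{\theta_j}\big(|t_{j,b_l} - E_{\theta_j}(t_{j,b_l})| \ge E_{\theta_j}(t_{j,b_l}) - D T_d\big)$$

$$\le \frac{\mathrm{Var}_{\theta_j}(t_{j,b_l})}{(E_{\theta_j}(t_{j,b_l}) - D T_d)^2} = o(1). \qquad (7.12)$$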

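Second, in view of (7.10), (7.11), (7.12), it suffices to study the test procedures $\psi^{L}$ under the
alternatives $\theta \in \Theta_d(\tau, r_\varepsilon, b)$ such that $\max_{j : \xi_j = 1} \max_{1 \le l \le N} E_{\theta_j} t_{j,b_l} = O(T_d)$. Then we obtain

$$E_\theta(1 - \psi^{L}) = P_\theta\Big(\max_{1 \le l \le N-1} L(u_l, b_l) \le H\Big) \le \min_{1 \le l \le N-1} P_\theta(L(u_l, b_l) \le H).$$

For any $l \in \{1, \ldots, N - 1\}$,

$$P_\theta(L(u_l, b_l) \le H) \le P_\theta\big(L(u_l, b_l) - E_\theta(L(u_l, b_l)) \le H - E_\theta(L(u_l, b_l))\big)$$

$$\le P_\theta\big(-|L(u_l, b_l) - E_\theta(L(u_l, b_l))| \le H - E_\theta(L(u_l, b_l))\big)$$

$$\le P_\theta\big(|L(u_l, b_l) - E_\theta(L(u_l, b_l))| \ge -H + E_\theta(L(u_l, b_l))\big)$$

$$\le \frac{\mathrm{Var}_\theta(L(u_l, b_l))}{(E_\theta(L(u_l, b_l)) - H)^2}. \qquad (7.13)$$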

For any $b_l \in (1/2, 1)$, if we prove that $\inf_{\theta \in \Theta_d(\tau, r_\varepsilon, b)} E_\theta(L(u_l, b_l))$ goes to infinity as a power of $d$
($d \to +\infty$), then Lemma 7.1 and the choice of $H$ (recall $H = O_d((\log d)^C)$, with $C > 1/4$) yield
the result since in this case the right-hand side of relation (7.13) goes to zero.

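Third, for $b \in (1/2, 1)$, take an index $l$ in $\{1, \ldots, N - 1\}$ such that $b_l \le b \le b_{l+1}$. This, combined
with the continuity of $\varphi$, yields

$$b_l = b + o(1), \quad r_\varepsilon^*(b_l) \le r_\varepsilon^*(b) \sim r_\varepsilon^*(b_l), \quad a(r_\varepsilon^*(b_l)) \le a(r_\varepsilon^*(b)) \sim a(r_\varepsilon^*(b_l)).$$

Let $\theta \in \Theta_d(\tau, r_\varepsilon, b)$ with $b \in (1/2, 1)$ and $\liminf(a(r_\varepsilon)/a(r_\varepsilon^*(b))) > 1$. Then $r_\varepsilon \ge (1 + \delta) r_\varepsilon^*(b_l)$ for
some $\delta > 0$. Proposition 4.1 entails that for $j$ such that $\xi_j = 1$ we have

$$E_{\theta_j} t_{j,b_l} \ge (1 + \delta)^2 a(r_\varepsilon^*(b_l)) \sim (1 + \delta)^2 a(r_\varepsilon^*(b)) \sim (1 + \delta)^2 \varphi(b) T_d.$$
We then derive from Lemma 7.1 with $c = c(b) = (1 + \delta)^2 \varphi(b)$ that

$$\inf_{\theta \in \Theta_d(\tau, r_\varepsilon, b)} E_\theta(L(u_l, b_l)) \ge d^{\,1/2 - b + \left(\frac{u_l^2}{4} - \frac{((u_l - c)_+)^2}{2}\right)(1 + o(1))} (1 + o(1)). \qquad (7.14)$$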

Finally, denote the main term in the exponent of $d$ in (7.14) by

$$M = \frac{1}{2} + \frac{u(b)^2}{4} - b - \frac{((u(b) - c(b))_+)^2}{2}.$$

To obtain the result, it is sufficient to prove that $M$ is positive and bounded away from zero for
any $\delta > 0$.

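Intermediate sparsity case. This case corresponds to $b \in (1/2, 3/4]$. Recall that $u(b) = 2\varphi_1(b)$,
where $\varphi_1$ is defined in (2.2). Then

$$M \ge \begin{cases} \varphi_1^2(b)\big(1 - (2 - (1 + \delta)^2)^2\big)/2 > 0 & \text{for } \delta < \sqrt{2} - 1, \\ \varphi_1^2(b)/2 > 0 & \text{for } \delta \ge \sqrt{2} - 1. \end{cases}$$

The latter inequalities are obviously satisfied. This leads to the result.
Highest sparsity case. In this case $b \in (3/4, 1)$ and $u(b) = \sqrt{2}$. Then

$$M \ge \begin{cases} 1 - b - \big(\sqrt{2} - (1 + \delta)^2 \varphi_2(b)\big)^2/2 > 0 & \text{for } (1 + \delta)^2 \varphi_2(b) < \sqrt{2}, \\ 1 - b > 0 & \text{for } (1 + \delta)^2 \varphi_2(b) \ge \sqrt{2}. \end{cases}$$

Again, the latter inequalities are satisfied, and the result follows.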
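The sign of the exponent $M$ in the two sparsity regimes can be checked numerically. The sketch below assumes the standard forms $\varphi_1(b) = \sqrt{2b - 1}$ and $\varphi_2(b) = \sqrt{2}(1 - \sqrt{1 - b})$ for the functions referred to in (2.2), which is not reproduced in this section; the first form is stated later in the text, and the second is the one for which $M$ vanishes exactly at $\delta = 0$. Function names are illustrative.

```python
import math

def phi(b):
    """Detection boundary: phi_1 on (1/2, 3/4], phi_2 on (3/4, 1) (assumed form of (2.2))."""
    if b <= 0.75:
        return math.sqrt(2.0 * b - 1.0)
    return math.sqrt(2.0) * (1.0 - math.sqrt(1.0 - b))

def u(b):
    """Threshold multiplier: 2*phi_1(b) for b <= 3/4 and sqrt(2) for b > 3/4."""
    return 2.0 * math.sqrt(2.0 * b - 1.0) if b <= 0.75 else math.sqrt(2.0)

def exponent_M(b, delta):
    """Main term M = 1/2 + u^2/4 - b - ((u - c)_+)^2 / 2 with c = (1 + delta)^2 * phi(b)."""
    c = (1.0 + delta) ** 2 * phi(b)
    gap = max(0.0, u(b) - c)
    return 0.5 + u(b) ** 2 / 4.0 - b - gap ** 2 / 2.0

for b in (0.55, 0.70, 0.75, 0.80, 0.95):
    values = [exponent_M(b, delta) for delta in (0.0, 0.01, 0.1, 0.5)]
    print(b, [round(v, 5) for v in values])
```

At $\delta = 0$ the exponent is zero in both regimes, and it becomes strictly positive as soon as $\delta > 0$, in line with the case analysis above.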



Proof of (i)-Theorem 6.1.

Similar to the proof of part (ii) of Theorem 5.1, due to (6.3) and (6.4), uniformly over $\theta \in
\Theta_d^{ext}(\tau, K^{1/2}, K^{1/2} r_\varepsilon, b)$, the type II error probability of $\psi^{\chi^2}_{\alpha}$ goes to zero as soon as $r_\varepsilon / r_\varepsilon^* \to +\infty$.

Proof of (ii)-Theorem 6.1.

The proof of the fact that the type II error probability of $\psi^{HC}$ goes to zero as $d \to +\infty$ is similar to
the one of Theorem 5.2. Recall that $K = d^{1-b}$ is the number of nonzero $\xi_j$'s and suppose without
loss of generality that $\xi_j = 1$, $\forall j \in \{1, \ldots, K\}$ and $\xi_j = 0$, $\forall j \in \{K + 1, \ldots, d\}$. Note that relations
(7.10) and (7.11) remain valid for any $\theta \in \Theta_d^{ext}(\tau, K^{1/2}, K^{1/2} r_\varepsilon, b)$.

First, similarly to (7.12), for any $\theta \in \Theta_d^{ext}(\tau, K^{1/2}, K^{1/2} r_\varepsilon, b)$ such that for the nonzero $\xi_j$'s,
there exists $l \in \{1, \ldots, N\}$ for which $E_{\theta_j} t_{j,b_l} \ge D_1 T_d$ with $D_1 > D$, the type II error probability
of $\psi^{HC}$ vanishes asymptotically. Therefore, it suffices to study the test procedures $\psi^{L}$ under the
alternatives $\theta \in \Theta_d^{ext}(\tau, K^{1/2}, K^{1/2} r_\varepsilon, b)$ such that $\max_j \max_l E_{\theta_j} t_{j,b_l} = O(T_d)$. Therefore, let us
take $\delta > 0$ and consider the alternatives that are as far away from the null hypothesis as $r_\varepsilon$ such
that $r_\varepsilon \ge (1 + \delta) r_\varepsilon^*(b)$, where $r_\varepsilon^*(b)$ is determined by $a(r_\varepsilon^*(b)) \sim T_d \varphi(b)$.

Second, for any $l \in \{1, \ldots, N\}$, observe that the only difference between the proofs of the
extended and initial problems lies in the study of

$$\inf_{\theta \in \Theta_d^{ext}(\tau, K^{1/2}, K^{1/2} r_\varepsilon, b)} \sum_{j=1}^{K} P_{\theta_j}\big(t_{j,b_l} - E_{\theta_j}(t_{j,b_l}) > u_l T_d - E_{\theta_j}(t_{j,b_l})\big). \qquad (7.15)$$

Now it is no longer possible to control (7.15) by using Lemma 7.1(ii) because the condition
$E_{\theta_j}(t_j) \ge c T_d$ is not necessarily satisfied for all nonzero $\xi_j$'s. In fact, the only condition we have is

$$\sum_{j=1}^{K} E_{\theta_j}(t_j) \ge c K T_d \quad \text{with some constant } c > 1.$$

Let us now explain why the current proof is reduced to the study of (7.15). As in (7.13), we get
for any $\theta$ in $\Theta_d^{ext}(\tau, K^{1/2}, K^{1/2} r_\varepsilon, b)$,

$$P_\theta\Big(\max_{1 \le l \le N} L(u_l, b_l) \le H\Big) \le \min_{1 \le l \le N} \frac{\mathrm{Var}_\theta(L(u_l, b_l))}{(E_\theta(L(u_l, b_l)) - H)^2}.$$

Due to Lemma 7.1 and the fact that $H = O_d((\log d)^C)$ with $C > 1/4$, in order to obtain the result,
it remains to prove that for any $l$ such that $b_l \le b \le b_{l+1}$, the quantity
$\inf_{\theta \in \Theta_d^{ext}(\tau, K^{1/2}, K^{1/2} r_\varepsilon, b)} E_\theta(L(u_l, b_l))$ tends to $+\infty$ as $d \to +\infty$
as a positive power of $d$. Finally, recall that

$$E_\theta(L(u_l, b_l)) = C_{u_l, b_l} \sum_{j=1}^{K} \Big( P_{\theta_j}\big(t_{j,b_l} - E_{\theta_j}(t_{j,b_l}) > u_l T_d - E_{\theta_j}(t_{j,b_l})\big) - \Phi_{0,b_l}(u_l T_d) \Big), \qquad (7.16)$$

where $C_{u_l, b_l} = \big(d \, \Phi_{0,b_l}(u_l T_d)(1 - \Phi_{0,b_l}(u_l T_d))\big)^{-1/2}$ and $\Phi_{0,b_l}(x) = P_0(t_{j,b_l} > x)$. The term on the
right-hand side of (7.16) corresponds to the product of (7.15) and $C_{u_l, b_l}$. The quantity $C_{u_l, b_l}$ is
controlled by Lemma 7.1 and Proposition 7.1. Thus it remains to study (7.15).

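Third, the application of Proposition 7.1 gives the following approximation of (7.15),

$$\sum_{j=1}^{K} P_{\theta_j}\big(t_{j,b_l} - E_{\theta_j}(t_{j,b_l}) > u_l T_d - E_{\theta_j}(t_{j,b_l})\big) = \sum_{j=1}^{K} \exp\Big( -\frac{((u_l T_d - E_{\theta_j}(t_{j,b_l}))_+)^2}{2}(1 + o(1)) \Big).$$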
Recall that $a(r_\varepsilon)$ given by (4.8) is the solution of the extremal problem (4.1). Set $\eta_j = E_{\theta_j}(t_{j,b_l})$,
$\eta_0 = (1 + \delta)^2 a(r_\varepsilon^*(b)) \sim (1 + \delta)^2 a(r_\varepsilon^*(b_l))$ and $f_T(\eta) = \exp(-(T - \eta)^2/2)$, $\forall \eta \in [0, R]$, where $R = R(T) > 0$
will be specified later on. Consider also

$$F_{K,T}(\eta_0) = \inf \sum_{j=1}^{K} f_T(\eta_j) \quad \text{subject to} \quad \sum_{j=1}^{K} \eta_j \ge K \eta_0.$$

Due to relation (6.4), we have for the sequence w(r*(bi)) that 

K K K 

Then, in order to obtain the same right-hand side as in (7.14), it is sufficient to show that for
any $l$ in $\{1, \ldots, N\}$ such that $T = u_l T_d$, relation (7.17), which is stated below, holds:

$$F_{K,T}(\eta_0) = K f_T(\eta_0). \qquad (7.17)$$

This is handled by a technical result similar to the one stated in Lemma 7.4 and Lemma 7.5 in
Ingster et al. [17]. The proof of Lemma 7.2 is postponed to Section 7.4.

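Lemma 7.2. Set $\lambda = (T - \eta_0) f_T(\eta_0)$.

If $0 < \eta_0 \le T - 1$ and $T \le R \le T + \big((T - \eta_0)^2 - 2\log(1 + 2(T - \eta_0)^2)\big)^{1/2}$, $\qquad (7.18)$

then

$$\inf_{\eta \in [0, R]} (f_T(\eta) - \lambda \eta) = f_T(\eta_0) - \lambda \eta_0, \qquad (7.19)$$

which implies that

$$F_{K,T}(\eta_0) = K f_T(\eta_0). \qquad (7.20)$$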
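The tangent-line inequality (7.19) can be checked on a grid. The sketch below takes $f_T(\eta) = \exp(-(T - \eta)^2/2)$ and $\lambda = f_T'(\eta_0) = (T - \eta_0) f_T(\eta_0)$, and evaluates the gap $(f_T(\eta) - \lambda\eta) - (f_T(\eta_0) - \lambda\eta_0)$ over $[0, R]$ for hypothetical values $T = 6$ and $\eta_0 = 3$. It is an illustration under these assumptions, not a substitute for the proof in Section 7.4.

```python
import math

def f(T, eta):
    """f_T(eta) = exp(-(T - eta)^2 / 2)."""
    return math.exp(-(T - eta) ** 2 / 2.0)

def tangent_gap_min(T, eta0, R, steps=20_000):
    """Minimum over a grid of [0, R] of (f(eta) - lam*eta) - (f(eta0) - lam*eta0)."""
    lam = (T - eta0) * f(T, eta0)          # slope of f_T at eta0
    base = f(T, eta0) - lam * eta0
    return min(
        f(T, R * i / steps) - lam * (R * i / steps) - base
        for i in range(steps + 1)
    )

T, eta0 = 6.0, 3.0                          # hypothetical values with eta0 <= T - 1
a = T - eta0
r_max = T + math.sqrt(a * a - 2.0 * math.log(1.0 + 2.0 * a * a))  # upper limit in (7.18)

gap_inside = tangent_gap_min(T, eta0, r_max)   # R at the limit allowed by (7.18)
gap_outside = tangent_gap_min(T, eta0, 12.0)   # R well beyond that limit
print(f"R = {r_max:.4f}: min gap = {gap_inside:.3e}")
print(f"R = 12.0000: min gap = {gap_outside:.3e}")
```

With $R$ inside the range (7.18) the gap stays nonnegative, so summing the inequality over $j$ under the constraint $\sum_j \eta_j \ge K\eta_0$ gives $\sum_j f_T(\eta_j) \ge K f_T(\eta_0)$; for $R$ too large the gap becomes negative, which shows why the upper bound on $R$ is needed.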

As d -> +00, for any i G {1, . . . , N} such that T = uiT d with m > (l + 5) 2 (^(fo) and R = pT d with 
ui < p < ui + y(b) , the conditions in (7.18) are then satisfied. Therefore the application 

of Lemma 7.2 yields the results since for all 8 G Q d xt (r, K 1 / 2 , K 1 / 2 r t , b), 

*HL M ) > C UlM K (ex P (- «»' T ' - (1 + f« )0(1) - exp(-^(l + o(l))) , 
which corresponds to the right-hand side of (7.14). 
7.3. Lower Bound 

The prior distribution we consider is a classical one for a functional Gaussian model. In Section
4.3 of [13] it is referred to as the symmetric Three-point Factors.

Prior. Before defining the prior $\Pi^d$ formally, we shall start with an informal discussion.
The prior $\Pi^d$ puts mass on $(\xi_j \theta_j)_{1 \le j \le d}$: the components are i.i.d. and $\xi_j$ and $\theta_j$ are supposed
to be independent. A natural choice for $\xi_j$ is a Bernoulli with a parameter $p_d \in (0, 1)$ such that
$E(\sum_{j=1}^d \xi_j) \sim K$. The $\theta_{j,k}$'s are binary random variables (with probability 1/2) such that $\theta_{j,k}^2 = (\theta_k^*)^2$,
where the sequence $\theta^*$ is a solution of the extremal problem (4.1); this guarantees that $\theta_j$ belongs
to $\Theta(\tau, r_\varepsilon)$.
Now, we define the prior distribution more precisely. Let pd be any sequence of positive numbers
such that pd d ^+$° and d 1 ~ b p d +oo, V& € (0, 1), Vs > 0. Consider two sequences and 

$(\theta_{j,k})_{j,k}$ of independent random variables whose distributions are the following:

$\xi_j \sim$ Bernoulli $B(p_d)$ with $p_d = d^{-b}(1 + \rho_d)$, $j \in \{1, \ldots, d\}$,
$\theta_{j,k} = \epsilon_{j,k} \, \varepsilon z_k$, with $P(\epsilon_{j,k} = 1) = P(\epsilon_{j,k} = -1) = 1/2$, $j \in \{1, \ldots, d\}$, $k \in \mathbb{Z}$.

The sequence $(z_k)_{k \in \mathbb{Z}}$ is deterministic and is defined as follows: $(\varepsilon z_k)_k = (\theta_k^*)_{k \in \mathbb{Z}} = \theta^*$, where $\theta^*$
is the sequence that leads to the solution (4.8) of the extremal problem (4.1). In particular, this
entails that

Ey = « 2 (^)' ( 7 - 21 ) 

fcez 

(2tt) 2 ^ \k\ 2T (ez k ) 2 < 1, (7.22) 
fcez 

fcez 

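The sequences $(\xi_j)_j$ and $(\theta_{j,k})_{j,k}$ are also taken mutually independent. For each $j$ in $\{1, \ldots, d\}$, we
define the prior distribution $\pi_j^d$ on $(\xi_j \theta_j)$ as follows:

$$\pi_j^d = (1 - p_d)\delta_0 + p_d \prod_{k \in \mathbb{Z}} \pi_{j,k} = (1 - p_d)\delta_0 + p_d \pi_j, \qquad (7.24)$$

where $\pi_j = \prod_{k \in \mathbb{Z}} \pi_{j,k}$, $\pi_{j,k} = \frac{1}{2}(\delta_{-\varepsilon z_k} + \delta_{\varepsilon z_k})$ puts mass on $\theta_{j,k}$ and $\delta$ is the Dirac mass. Finally,
we define the global prior $\Pi^d$ by

$$\Pi^d = \prod_{j=1}^{d} \pi_j^d.$$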

Minimax and Bayesian risks. Denote by $P_{\Pi^d}$ the mixture of the measures $P_\theta$ over the prior
$\Pi^d$, and let $\gamma(Q)$ be the minimal total error probability for testing a simple null hypothesis $H_0$:
$P = P_0$ against a simple alternative $H_1$: $P = Q$ regarding the measure $P$ of our observations
$(x_{j,k})_{k \in \mathbb{Z},\, 1 \le j \le d}$.

Proposition 7.2.

$$\gamma \ge \gamma(P_{\Pi^d}) + o(1), \qquad (7.25)$$
where $\gamma$ is the minimax total error probability over $\Theta_d(\tau, r_\varepsilon, b)$ (see (3.2)).

Proof of Proposition 7.2.

Consider two sets $\Xi(s)$ and $\Xi^+(s)$ defined by

$$\Xi(s) = \Big\{ \zeta = \varepsilon\big(\xi_1(\epsilon_{1,k} z_k)_k, \ldots, \xi_d(\epsilon_{d,k} z_k)_k\big) : \sum_{j=1}^{d} \xi_j = s \Big\}, \ 0 \le s \le d, \qquad \Xi^+(s) = \bigcup_{s \le l \le d} \Xi(l).$$

First, due to relations (7.22) and (7.23), $\Xi(K)$ is included in $\Theta_d(\tau, r_\varepsilon, b)$. This entails that

$$\gamma \ge \gamma(\Xi(K)). \qquad (7.26)$$

Second, let us introduce some additional priors: for any subset $u \subset \{1, \ldots, d\}$, define $\pi_u =
\prod_{j \in u} \pi_j \prod_{j \notin u} \delta_0$, where $\pi_j$ is as in (7.24). Note that $\pi_u$ has a support on the collections
$\zeta = \varepsilon(\xi_1(\epsilon_{1,k} z_k)_k, \ldots, \xi_d(\epsilon_{d,k} z_k)_k)$ with $\xi_j = 1$ if and only if $j \in u$. For any integer $s$ such
that $0 \le s \le d$, let $U_{d,s}$ be the set of all subsets $u \subset \{1, \ldots, d\}$ of cardinality $s$, and define $\pi_{(s)}$ as
the uniform distribution on $U_{d,s}$:

$$\pi_{(s)} = \binom{d}{s}^{-1} \sum_{u \in U_{d,s}} \pi_u.$$

Observe that the prior $\Pi^d$ is of the form $\Pi^d = \sum_{s=0}^{d} r_s \pi_{(s)}$, where $r_s = \binom{d}{s} p_d^s (1 - p_d)^{d-s}$. Clearly,
$\pi_{(K)}(\Xi(K)) = 1$, which implies

$$\gamma(\Xi(K)) \ge \gamma(P_{\pi_{(K)}}). \qquad (7.27)$$

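Third, consider the conditional prior of the form $\Pi^d_+$ with respect to $\Xi^+(K)$, i.e.,

$$\Pi^d_+(A) = \frac{\Pi^d(A \cap \Xi^+(K))}{\Pi^d(\Xi^+(K))}, \ \text{ which is of the form } \ \Pi^d_+ = \sum_{s=K}^{d} q_s \pi_{(s)} \ \text{ with } \ q_s = \frac{r_s}{\sum_{l=K}^{d} r_l}, \ K \le s \le d.$$

Let us prove that

$$\gamma(P_{\pi_{(K)}}) \ge \gamma(P_{\Pi^d_+}). \qquad (7.28)$$

Denote by $X_K = \{(x_{j,k})_{j,k} : \frac{dP_{\pi_{(K)}}}{dP_0}((x_{j,k})_{j,k}) < 1\}$ the admissible set of the optimal test for
testing $H_0$: $P = P_0$ against $H_1$: $P = P_{\pi_{(K)}}$. Since

$$\gamma(P_{\pi_{(K)}}) = 1 - P_0(X_K) + P_{\pi_{(K)}}(X_K) \quad \text{and} \quad \gamma(P_{\Pi^d_+}) \le 1 - P_0(X_K) + P_{\Pi^d_+}(X_K),$$

proving (7.28) is then reduced to checking that

$$P_{\pi_{(K)}}(X_K) \ge P_{\Pi^d_+}(X_K). \qquad (7.29)$$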

In view of Proposition 2.5 in [13], $X_K$ is a convex set. Also, the set $X_K$ is sign-invariant and
invariant with respect to all permutations of the $x_{j,k}$'s; the measures $P_{\pi_{(s)}}$, $1 \le s \le d$, have the
same property of invariance with respect to all permutations of the $x_{j,k}$'s. These observations imply

$$P_{\pi_{(K)}}(X_K) = P_{\theta^K}(X_K), \qquad P_{\Pi^d_+}(X_K) = \sum_{s=K}^{d} q_s P_{\theta^s}(X_K),$$

where $\theta^s = \varepsilon(z, \ldots, z, 0, \ldots, 0)$ with $s$ copies of $z = (z_k)_{k \in \mathbb{Z}}$. Since $\theta^s_{j,k} \ge \theta^K_{j,k} \ge 0$, $\forall j, k$, $s \ge K$, the application
of Lemma 2.4 in [13] entails that $P_{\theta^K}(X_K) \ge P_{\theta^s}(X_K)$, $s \ge K$. This yields relation (7.29) and
hence relation (7.28).

Finally, in view of Proposition 2.11 in [13], it remains to check that

$$\gamma(P_{\Pi^d_+}) = \gamma(P_{\Pi^d}) + o(1). \qquad (7.30)$$

Similarly to the proof of Proposition 2.9 in [13], it is easily seen that (7.30) follows from the
relation

$$\Pi^d(\Xi^+(K)) \to 1. \qquad (7.31)$$

Acting as in the proof of Proposition 3 in [12], we obtain by Chebyshev's inequality,

$$1 - \Pi^d(\Xi^+(K)) = \Pi^d\Big(\sum_{j=1}^{d} \xi_j < d^{1-b}\Big)$$

$$= \Pi^d\Big(d p_d - \sum_{j=1}^{d} \xi_j > d p_d - d^{1-b}\Big)$$

$$\le \frac{d^{1-b}(1 + \rho_d)(1 - d^{-b}(1 + \rho_d))}{(d^{1-b} \rho_d)^2},$$

where the ratio on the right-hand side tends to zero as $d$ goes to infinity. Relation (7.31) is then
proved.

As relations (7.26), (7.27), (7.28), and (7.30) imply (7.25), the proof of Proposition 7.2 is completed.

Due to Proposition 7.2, the proof of the lower bound is reduced to bounding $\gamma^* = \gamma(P_{\Pi^d})$ from
below.
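The Chebyshev bound used above can be evaluated numerically to see how slowly it decays. The sketch below treats $\sum_j \xi_j$ as a Binomial$(d, p_d)$ count with $p_d = d^{-b}(1 + \rho_d)$, and uses the purely illustrative choice $\rho_d = 1/\log d$ (the paper only requires conditions on the sequence $\rho_d$, not this particular form).

```python
import math

def chebyshev_bound(d, b, rho):
    """Variance of a Binomial(d, p) count divided by the squared deviation d*p - d^(1-b)."""
    p = d ** (-b) * (1.0 + rho)
    variance = d * p * (1.0 - p)
    deviation = d * p - d ** (1.0 - b)      # equals d^(1-b) * rho
    return variance / deviation ** 2

b = 0.75
dimensions = [1e12, 1e18, 1e24]
bounds = [chebyshev_bound(d, b, 1.0 / math.log(d)) for d in dimensions]
for d, value in zip(dimensions, bounds):
    print(f"d = {d:.0e}: bound = {value:.4f}")
```

The bound is of order $1/(d^{1-b}\rho_d^2)$, so it tends to zero only when $d^{1-b}\rho_d^2 \to +\infty$; with the choice above this requires very large $d$ when $b$ is close to 1.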
Before studying $\gamma^*$, we introduce some useful notation and make some helpful remarks. Denote
by $\|\cdot\|_{TV}$ and $\|\cdot\|_2$ the distance in variation and the $L_2$-distance between any pair of probabilities
$(P, Q)$; the latter one is defined by

$$\|P - Q\|_2^2 = \begin{cases} +\infty & \text{if } P \text{ does not dominate } Q, \\ E_P(L - 1)^2 & \text{if } P \text{ dominates } Q, \end{cases} \qquad (7.32)$$

where $L = dQ/dP$ is the Radon-Nikodym derivative of $Q$ with respect to $P$.

Remark 7.3. Note that

• $\|\cdot\|_{TV} = \|\cdot\|_1$, where $\|\cdot\|_1$ is the $L_1$-distance.

• If $P$ dominates $Q$, then $\|P - Q\|_2^2 = E_P(L^2) - 1$.

• As stated in Proposition 2.12 of [13],

If $\|P - Q\|_2 = o(1)$, then $\|P - Q\|_1 = o(1)$.

If $\|P - Q\|_2$ is bounded, then $\limsup \|P - Q\|_1 < 2$.

Using Remark 7.3, one has

If $\|P_0 - P_{\Pi^d}\|_2 = o(1)$, then $\|P_0 - P_{\Pi^d}\|_1 = o(1)$ and $\gamma^* \to 1$. $\qquad (7.33)$
If $\|P_0 - P_{\Pi^d}\|_2 = O(1)$, then $\limsup \|P_0 - P_{\Pi^d}\|_1 < 2$ and $\liminf \gamma^* > 0$. $\qquad (7.34)$

Therefore, if needed, the $L_2$-distance can be conveniently used instead of the total variation distance.

Due to (7.33) and (7.34), it remains to study $\|P_0 - P_{\Pi^d}\|_2$, which is expressed in terms of the
Bayesian likelihood ratio $L_{\Pi^d} = dP_{\Pi^d}/dP_0$ (see relation (7.32)).

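Likelihood ratios. Here and below, when it is not absolutely necessary, we omit the arguments
of the likelihood ratios. Then, observe that $L_{\Pi^d}$ is defined by:

$$L_{\Pi^d} = \prod_{j=1}^{d} \frac{dP_{\pi_j^d}}{dP_0} = \prod_{j=1}^{d} (1 - p_d + p_d L_j),$$

where $L_j$ is the likelihood ratio between $P_{\pi_j}$ and $P_0$. Denote also by $L_{\pi_j^d}$ the likelihood ratio
between $P_{\pi_j^d}$ and $P_0$, i.e., $L_{\pi_j^d} = (1 - p_d + p_d L_j)$. Then $L_j$ is such that

$$L_j = \prod_{k \in \mathbb{Z}} \frac{1}{2}\Big( \exp\Big(-\frac{z_k^2}{2} + z_k x_{j,k}/\varepsilon\Big) + \exp\Big(-\frac{z_k^2}{2} - z_k x_{j,k}/\varepsilon\Big) \Big)$$

$$= \prod_{k \in \mathbb{Z}} \exp\Big(-\frac{z_k^2}{2}\Big) \cosh(z_k x_{j,k}/\varepsilon), \qquad (7.35)$$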

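where cosh is the hyperbolic cosine. Using routine calculations, in particular, using twice the
inequality $1 + x \le \exp(x)$, $\forall x \in \mathbb{R}$, we obtain

$$E_0(L_{\pi_j^d}^2) = 1 + p_d^2 \{E_0(L_j^2) - 1\}$$

$$= 1 + p_d^2 \Big\{ \prod_{k \in \mathbb{Z}} \Big(1 + 2\Big(\sinh\Big(\frac{z_k^2}{2}\Big)\Big)^2\Big) - 1 \Big\}$$

$$\le 1 + p_d^2 \Big\{ \exp\Big(\sum_{k \in \mathbb{Z}} 2\Big(\sinh\Big(\frac{z_k^2}{2}\Big)\Big)^2\Big) - 1 \Big\}$$

$$\le \exp\Big( p_d^2 \Big\{ \exp\Big(\sum_{k \in \mathbb{Z}} 2\Big(\sinh\Big(\frac{z_k^2}{2}\Big)\Big)^2\Big) - 1 \Big\} \Big),$$

where sinh denotes the hyperbolic sine. In view of Remark 7.3, in order to study $\|P_0 - P_{\Pi^d}\|_2$, it
suffices to study $E_0(L_{\Pi^d} - 1)^2$. The latter includes the quantity $E_0(L_{\Pi^d}^2)$ that satisfies

$$E_0(L_{\Pi^d}^2) = \prod_{j=1}^{d} E_0(L_{\pi_j^d}^2) \le \exp\Big( d p_d^2 \Big\{ \exp\Big(\sum_{k \in \mathbb{Z}} 2\Big(\sinh\Big(\frac{z_k^2}{2}\Big)\Big)^2\Big) - 1 \Big\} \Big). \qquad (7.36)$$

As $d$ goes to infinity, the right-hand side of (7.36) goes to one provided that

$$d p_d^2 (\exp(A) - 1) \to 0 \ \text{ as } d \to +\infty, \quad \text{with } A = \sum_{k \in \mathbb{Z}} 2\Big(\sinh\Big(\frac{z_k^2}{2}\Big)\Big)^2. \qquad (7.37)$$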
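The per-coordinate second moment behind the product over $k$ can be verified by numerical integration. For a single coordinate with $y = x_{j,k}/\varepsilon$ standard Gaussian under $P_0$, the factor $\exp(-z^2/2)\cosh(zy)$ has second moment $\cosh(z^2) = 1 + 2\sinh^2(z^2/2)$. The sketch below checks this with a simple trapezoid rule; the value $z = 0.8$ and the integration range are illustrative choices.

```python
import math

def second_moment_factor(z, half_width=12.0, steps=4000):
    """Trapezoid approximation of E[(exp(-z^2/2) cosh(z*y))^2] for y ~ N(0, 1)."""
    step = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        y = -half_width + i * step
        density = math.exp(-y * y / 2.0) / math.sqrt(2.0 * math.pi)
        value = (math.exp(-z * z / 2.0) * math.cosh(z * y)) ** 2 * density
        total += value * (0.5 if i in (0, steps) else 1.0)   # halve the end points
    return total * step

z = 0.8
numeric = second_moment_factor(z)
closed_form = 1.0 + 2.0 * math.sinh(z * z / 2.0) ** 2
print(f"numeric = {numeric:.10f}, 1 + 2 sinh^2(z^2/2) = {closed_form:.10f}")
```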

Proof of (i)-Theorem 5.1.

Recall that by assumption $b \in (0, 1/2]$. We shall distinguish between two cases depending on the
values of $r_\varepsilon$ with respect to $r_\varepsilon^*$ defined in (5.9).

Case 1: $r_\varepsilon / r_\varepsilon^* = O(1)$. Since $d p_d^2 (a(r_\varepsilon^*))^2 = O(1)$, it follows that $d p_d^2 a^2(r_\varepsilon) = O(1)$. Since $d p_d^2$ is
bounded away from zero, $a^2(r_\varepsilon) = O(1)$, and, due to Remark 4.2 and relations (4.10), we have
$\sup_k z_k^2 = o(1)$. This entails that $\sinh^2(z_k^2/2) \sim z_k^4/4$ which, due to (7.21), implies that $A \sim \sum_k z_k^4/2 \sim
a^2(r_\varepsilon)$, and hence $A = O(1)$. It now follows that $\exp(A) - 1 \asymp A$. Finally, we get

$$d p_d^2 (\exp(A) - 1) \asymp d p_d^2 a^2(r_\varepsilon) = O(1), \qquad (7.38)$$

and the second part of (i) in Theorem 5.1 is proved.

Case 2: $r_\varepsilon / r_\varepsilon^* = o(1)$. Due to (7.38), we have $d p_d^2 (\exp(A) - 1) \asymp d p_d^2 a^2(r_\varepsilon)$, and since
$a^2(r_\varepsilon)/a^2(r_\varepsilon^*) = o(1)$, relation (7.37) is trivially fulfilled.
Proof of (i)-Theorem 5.2.

Now by assumption $b \in (1/2, 1)$. Due to the condition $\log d = o(\varepsilon^{-2/(2\tau+1)})$, Remark 4.2, and
relations (4.10), $\sup_k z_k^2 = o(1)$. As in the moderate case, this yields $A \sim a^2(r_\varepsilon)$, and thus we obtain

$$d p_d^2 (\exp(A) - 1) = d p_d^2 \exp(a^2(r_\varepsilon)(1 + o(1))). \qquad (7.39)$$

Again, we shall consider two cases depending on the values of $r_\varepsilon$ with respect to $r_\varepsilon^*$, where $r_\varepsilon^*$ is
now defined by (5.11).

Case 1: Suppose that $r_\varepsilon / r_\varepsilon^* = o(1)$. Then $a(r_\varepsilon) = o(T_d)$. Due to equation (7.39), this implies that
relation (7.37) is fulfilled.

Case 2: Suppose that $r_\varepsilon / r_\varepsilon^* = O(1)$ and let $c(r_\varepsilon)$ be a positive constant satisfying $c^2(r_\varepsilon) \log d =
a^2(r_\varepsilon)$. Then the right-hand side of (7.39) can be rewritten as follows:

$$d p_d^2 \exp(a^2(r_\varepsilon)(1 + o(1))) = d^{1 - 2b}(1 + \rho_d)^2 \exp(\log(d) \, c^2(r_\varepsilon)(1 + o(1)))$$
$$= d^{\,1 - 2b + c^2(r_\varepsilon)(1 + o(1))}(1 + \rho_d)^2.$$






Therefore, relation (7.37) is fulfilled provided that $c(r_\varepsilon) < \sqrt{2b - 1} = \varphi_1(b)$, where $\varphi_1$ is defined
in (2.2). This means that a successful detection is impossible if $c(r_\varepsilon) < \varphi_1(b)$, which corresponds
to the intermediate sparsity case; in fact, the inequality $c(r_\varepsilon) < \varphi_1(b)$ is valid for any $b \in (1/2, 1)$
but it could be improved for $b \in (3/4, 1)$. Indeed, for $b \in (3/4, 1)$, one can show that a successful
detection is impossible if $c(r_\varepsilon)$ is such that $c(r_\varepsilon) < \varphi_2(b)$, where the function $\varphi_2$ is defined in (2.2),
and for $b \in (3/4, 1)$, $\varphi_1(b) < \varphi_2(b)$. That is why the improvement is possible and is achieved by
dealing with a truncated version of the Bayesian likelihood ratio $L_{\Pi^d}$. From now on, let us consider
$a(r_\varepsilon) = c(r_\varepsilon)\sqrt{\log d}$ with $1/\sqrt{2} < c(r_\varepsilon) < \sqrt{2}$. The case $c(r_\varepsilon) \le 1/\sqrt{2}$ coincides with the intermediate
sparsity case when $b \in (1/2, 3/4]$.

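Thus, for some positive $v$, let us define $\tilde L_{\pi_d}$, the truncated version of the likelihood ratio $L_{\pi_d}$:

$$\tilde L_{\pi_d} = \prod_{j=1}^d \tilde L_{\pi,j} = \prod_{j=1}^d L_{\pi,j}\, 1_{\{\tilde l_j \le a(r_\varepsilon)\sqrt{(2+v)\log d}\}}, \tag{7.40}$$

where

$$\tilde l_j = \log(L_j) + \frac{1}{2} a^2(r_\varepsilon). \tag{7.41}$$

Also put

$$l_j = \log(L_j), \tag{7.42}$$
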

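where $L_j$ is defined by (7.35). Now we introduce two new probability measures $P_{\nu_j}$ and $P_{h_j}$ expressed in terms of $P_0$ as follows:

$$\frac{dP_{\nu_j}}{dP_0} = \exp(l_j), \tag{7.43}$$

$$\frac{dP_{h_j}}{dP_0} = \frac{\exp(2 l_j)}{E_0(\exp(2 l_j))}. \tag{7.44}$$
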
In order to get a lower bound for the minimax total error probability, it is sufficient to prove (see the proof of Theorem 4.1 in [11]) that $E_0((\tilde L_{\pi_d} - 1)^2) = o(1)$, where $\tilde L_{\pi_d}$ is defined in (7.40), provided that

$$P_0\Big(\bigcap_{j=1}^d \{\tilde l_j \le a(r_\varepsilon)\sqrt{(2+v)\log d}\}\Big) \to 1. \tag{7.45}$$

In fact, it is enough to prove that

$$\sum_{j=1}^d P_0\big(\tilde l_j > a(r_\varepsilon)\sqrt{(2+v)\log d}\big) \to 0. \tag{7.46}$$

Relation (7.46), and hence relation (7.45), follows from relation (7.47), which is a part of the next lemma whose proof is postponed to Section 7.4.

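Lemma 7.3. Assume that $r_\varepsilon \to 0$ and $\log d = o(\varepsilon^{-2/(2\tau+1)})$. If $T > 0$ is such that $T = O(a^2(r_\varepsilon))$, then

$$P_0(\tilde l_j > T) \le \exp\Big(-\frac{T^2}{2a^2(r_\varepsilon)} + o(a^2(r_\varepsilon))\Big). \tag{7.47}$$

Moreover, if $\liminf(T/a^2(r_\varepsilon)) > 1$, then

$$P_{\nu_j}(\tilde l_j > T) \le \exp\Big(-\frac{(T - a^2(r_\varepsilon))^2}{2a^2(r_\varepsilon)} + o(a^2(r_\varepsilon))\Big), \tag{7.48}$$

and if $\limsup(T/a^2(r_\varepsilon)) < 2$, then

$$P_{h_j}(\tilde l_j \le T) \le \exp\Big(-\frac{(T - 2a^2(r_\varepsilon))^2}{2a^2(r_\varepsilon)} + o(a^2(r_\varepsilon))\Big). \tag{7.49}$$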

Next, it remains to prove that $E_0(\tilde L_{\pi_d}) \to 1$ and $E_0((\tilde L_{\pi_d})^2) \to 1$. This will entail the expected result that $E_0((\tilde L_{\pi_d} - 1)^2) = o(1)$.

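First, consider the term $E_0(\tilde L_{\pi_d})$:

$$\begin{aligned}
E_0(\tilde L_{\pi_d}) &= \prod_{j=1}^d E_0(\tilde L_{\pi,j}) \\
&= \prod_{j=1}^d E_0\big(1 + p_d(L_j - 1) - (p_d(L_j - 1) + 1)1_{\bar{\mathcal D}_j}\big) \\
&= \prod_{j=1}^d \big(1 - p_d E_0(L_j 1_{\bar{\mathcal D}_j}) + (-1 + p_d)P_0(\bar{\mathcal D}_j)\big) \\
&= \exp\Big(\sum_{j=1}^d \log\big(1 - p_d E_0(L_j 1_{\bar{\mathcal D}_j}) + (-1 + p_d)P_0(\bar{\mathcal D}_j)\big)\Big),
\end{aligned} \tag{7.50}$$

where $\mathcal D_j = \{\tilde l_j \le a(r_\varepsilon)\sqrt{(2+v)\log d}\}$ and $\bar{\mathcal D}_j$ denotes the complement of $\mathcal D_j$. Relation (7.46) entails the convergence to zero of the second term in the log term of the right-hand side of (7.50). Therefore, in order to obtain $E_0(\tilde L_{\pi_d}) \to 1$, it is sufficient to prove that

$$d\, p_d\, E_0(L_j 1_{\bar{\mathcal D}_j}) = o(1). \tag{7.51}$$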



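Note that $E_0(L_j 1_{\bar{\mathcal D}_j}) = P_{\nu_j}(\bar{\mathcal D}_j)$. Since $\frac{\sqrt{2+v}}{c(r_\varepsilon)} - 1$ is positive ($c(r_\varepsilon) < \sqrt{2}$) for any positive $v$, we can apply relation (7.48) of Lemma 7.3 to get

$$\begin{aligned}
d\, p_d\, P_{\nu_j}(\bar{\mathcal D}_j) &\le d\, p_d \exp\Big(-\frac{1}{2}\log(d)\big((\sqrt{2+v} - c(r_\varepsilon))^2 + o(1)\big)\Big) \\
&= d^{1-b}(1 + \rho_d)\, d^{-\frac{1}{2}(\sqrt{2+v} - c(r_\varepsilon))^2 + o(1)},
\end{aligned} \tag{7.52}$$

where the right-hand side of (7.52) goes to zero as soon as $c(r_\varepsilon) < \sqrt{2+v} - \sqrt{2(1-b)}$. This yields relation (7.51).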

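Second, we need to study $E_0(\tilde L_{\pi_d}^2)$:

$$\begin{aligned}
E_0(\tilde L_{\pi_d}^2) &= \prod_{j=1}^d E_0\big((1 - p_d(1 - L_j))^2 1_{\mathcal D_j}\big) \\
&= \exp\Big(\sum_{j=1}^d \log\big(1 - 2p_d E_0((1 - L_j)1_{\mathcal D_j}) + E_0(p_d^2(1 - L_j)^2 1_{\mathcal D_j} - 1_{\bar{\mathcal D}_j})\big)\Big).
\end{aligned}$$

Since the relations $d P_0(\bar{\mathcal D}_j) = o(1)$ and $d p_d E_0((1 - L_j)1_{\mathcal D_j}) = o(1)$ have been already proved, it is sufficient to show that $d p_d^2 E_0((1 - L_j)^2 1_{\mathcal D_j}) = o(1)$. To this end, observe that

$$d E_0\big(p_d^2(1 - L_j)^2 1_{\mathcal D_j}\big) \le 2\big(d p_d^2 P_0(\mathcal D_j) + d p_d^2 E_0(L_j^2 1_{\mathcal D_j})\big). \tag{7.53}$$

The first term on the right-hand side of (7.53) tends to zero as $d$ goes to infinity since $d p_d^2 = d\, d^{-2b}(1 + \rho_d)^2$ for $b \in (3/4, 1)$.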

To study the second term on the right-hand side of (7.53), we take into account the following two points:

(i) since $\sup_k z_k^2 = o(1)$, we can apply Lemma 7.4 of Section 7.4 with $h = 2$, $X = x_{j,k}/\varepsilon$, and $z = z_k$, and obtain

$$E_0(\exp(2\tilde l_j)) = \exp(2a^2(r_\varepsilon)(1 + o(1))). \tag{7.54}$$

(ii) since $\limsup(T/a^2(r_\varepsilon)) < 2$ is satisfied as soon as $c(r_\varepsilon) > \sqrt{2+v}/2$ with $T = a(r_\varepsilon)\sqrt{(2+v)\log d}$, we can apply relation (7.49) of Lemma 7.3, which jointly with relation (7.54) leads to

$$\begin{aligned}
d p_d^2 E_0(L_j^2 1_{\mathcal D_j}) &= d\, d^{-2b}(1 + \rho_d)^2\, P_{h_j}\big(\tilde l_j \le a(r_\varepsilon)\sqrt{\log d}\sqrt{2+v}\big)\, E_0\big(\exp(2\tilde l_j - a^2(r_\varepsilon))\big) \\
&\le d\, d^{-2b}(1 + \rho_d)^2 \exp\Big(-\frac{a^2(r_\varepsilon)(\sqrt{2+v}\sqrt{\log d} - 2a(r_\varepsilon))^2}{2a^2(r_\varepsilon)} + a^2(r_\varepsilon) + o(a^2(r_\varepsilon))\Big) \\
&= d\, d^{-2b}(1 + \rho_d)^2 \exp\Big(\log(d)\Big(-\frac{1}{2}(\sqrt{2+v} - 2c(r_\varepsilon))^2 + c^2(r_\varepsilon) + o(1)\Big)\Big) \\
&= d\, d^{-2b}(1 + \rho_d)^2\, d^{-\frac{1}{2}(\sqrt{2+v} - 2c(r_\varepsilon))^2 + c^2(r_\varepsilon) + o(1)}.
\end{aligned} \tag{7.55}$$



The expression on the right-hand side of (7.55) goes to zero as soon as $c(r_\varepsilon) < \sqrt{2+v} - \sqrt{2(1-b)}$. The last inequality is obtained by resolving the inequality $1 - 2b - \frac{1}{2}(\sqrt{2+v} - 2x)^2 + x^2 < 0$, where $x$ is constrained to be larger than $\sqrt{2+v}/2$. This implies that a successful detection is impossible as soon as $c(r_\varepsilon) < \varphi_2(b)$, where $\varphi_2$ is defined by (2.2).
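
The two sufficient conditions obtained in this proof can be compared numerically. The Python sketch below is only an illustration under stated assumptions: the function names are hypothetical, the bound $\sqrt{2+v} - \sqrt{2(1-b)}$ and the exponent of $d$ are taken from the statements following (7.52) and (7.55), and the definition of $\varphi_2$ in (2.2) is not reproduced here.

```python
import math


def phi1(b):
    # sqrt(2b - 1): bound obtained from the untruncated likelihood ratio.
    return math.sqrt(2 * b - 1)


def truncated_bound(b, v=0.0):
    # sqrt(2 + v) - sqrt(2(1 - b)): bound stated after (7.52) and (7.55).
    return math.sqrt(2 + v) - math.sqrt(2 * (1 - b))


def exponent_755(x, b, v=0.0):
    # Exponent of d on the right-hand side of (7.55), with x = c(r_eps):
    # 1 - 2b - (sqrt(2 + v) - 2x)^2 / 2 + x^2.
    return 1 - 2 * b - 0.5 * (math.sqrt(2 + v) - 2 * x) ** 2 + x ** 2


if __name__ == "__main__":
    for b in (0.75, 0.8, 0.9, 0.99):
        print(b, phi1(b), truncated_bound(b))
```

With $v = 0$ the two bounds coincide at $b = 3/4$, and the truncated bound is the larger one for $b \in (3/4, 1)$, which is the improvement discussed above.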



7.4. Appendix

7.4.1. Proof of Lemma 7.2.

If there exists $A$ such that (7.19) is valid, then equation (7.20) is obtained by adapting the proof of Lemma 7.4 of [17]. Indeed, due to (7.19) and using the fact that $\sum_{j=1}^K \eta_j \ge K\eta_0$, we obtain for all $\eta_j \in [0, R]$, $j \in \{1, \ldots, K\}$:

$$\begin{aligned}
\sum_{j=1}^K f_T(\eta_j) &= \sum_{j=1}^K (f_T(\eta_j) - A\eta_j) + A\sum_{j=1}^K \eta_j \\
&\ge K(f_T(\eta_0) - A\eta_0) + AK\eta_0 \\
&= K f_T(\eta_0).
\end{aligned} \tag{7.56}$$

On the other hand,

$$\inf \sum_{j=1}^K f_T(\eta_j) \le K f_T(\eta_0). \tag{7.57}$$

Relations (7.56) and (7.57) yield relation (7.20).

Now, let us prove that (7.18) implies (7.19). For this, set $g_T(\eta) = f_T(\eta) - A\eta$ and denote by $g_T'$ and $g_T''$ the first and second derivatives of $g_T$, respectively. Note that $g_T'(\eta) = (T - \eta)f_T(\eta) - A$, and we choose $A = (T - \eta_0)f_T(\eta_0)$ to have $g_T'(\eta_0) = 0$.

The study of $g_T''$ yields that $g_T'' > 0$ for $|T - \eta| > 1$ and $g_T'' < 0$ for $|\eta - T| < 1$. Since $0 < \eta_0 < T - 1$, this implies that $g_T' < 0$ on $[0, \eta_0[$, $g_T'(\eta_0) = 0$, $g_T' > 0$ on $]\eta_0, T - 1]$, $g_T'$ is decreasing on $]T - 1, T + 1]$, and $g_T'$ is increasing on $]T + 1, +\infty[$. Moreover, $g_T'(T - 1) > 0$ and $g_T'(T) = -A < 0$, so that there exists $t \in\, ]T - 1, T[$ such that $g_T'(t) = 0$. This yields that $\eta_0$ is a local minimum of $g_T$. In order to prove that $\eta_0$ is a global minimum of $g_T$, it is sufficient to show that $g_T(R) - g_T(\eta_0) > 0$. Let us set $R = T + x$, with a positive real $x$. We already know that $x < T - \eta_0$ since $g_T(T + (T - \eta_0)) = f_T(\eta_0) - A(T + T - \eta_0) = f_T(\eta_0) - A\eta_0 - 2A(T - \eta_0) < g_T(\eta_0)$, where the last inequality is valid because of the choice of $A$ and $T - \eta_0$. For $x < (T - \eta_0)$, we obtain

$$\begin{aligned}
g_T(R) - g_T(\eta_0) &= \exp\Big(-\frac{x^2}{2}\Big) - (T - \eta_0)f_T(\eta_0)(T + x) - f_T(\eta_0) + (T - \eta_0)f_T(\eta_0)\eta_0 \\
&\ge \exp\Big(-\frac{x^2}{2}\Big) - f_T(\eta_0)\big(2(T - \eta_0)^2 + 1\big) > 0,
\end{aligned} \tag{7.58}$$

where inequality (7.58) is valid as soon as

$$\exp\Big(-\frac{x^2}{2}\Big) > \exp\Big(-\frac{(T - \eta_0)^2}{2}\Big)\big(2(T - \eta_0)^2 + 1\big) \iff x < \big((T - \eta_0)^2 - 2\log(2(T - \eta_0)^2 + 1)\big)^{1/2}.$$

Since (7.18) implies (7.19), this completes the proof of Lemma 7.2.
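
The global-minimum argument above can be checked numerically on a grid. The Python sketch below is an illustration only: it assumes $f_T(\eta) = \exp(-(T - \eta)^2/2)$, which is the form implied by the derivative $g_T'(\eta) = (T - \eta)f_T(\eta) - A$, and the helper names and the numerical values of $T$, $\eta_0$, and $R$ are hypothetical choices.

```python
import math


def f_T(eta, T):
    # Assumed form of f_T, consistent with g_T'(eta) = (T - eta) f_T(eta) - A.
    return math.exp(-((T - eta) ** 2) / 2)


def g_T(eta, T, eta0):
    # g_T(eta) = f_T(eta) - A * eta with A = (T - eta0) f_T(eta0).
    A = (T - eta0) * f_T(eta0, T)
    return f_T(eta, T) - A * eta


def x_limit(T, eta0):
    # Largest x allowed by the condition following (7.58).
    s = (T - eta0) ** 2
    return math.sqrt(s - 2 * math.log(2 * s + 1))


def is_global_min(T, eta0, R, n=10001):
    # Grid check that g_T(eta) >= g_T(eta0) for all eta in [0, R].
    ref = g_T(eta0, T, eta0)
    return all(g_T(i * R / (n - 1), T, eta0) >= ref - 1e-12 for i in range(n))


if __name__ == "__main__":
    T, eta0 = 4.0, 1.0
    print(x_limit(T, eta0))
    print(is_global_min(T, eta0, T + 1.7), is_global_min(T, eta0, 7.0))
```

In this example $R = T + 1.7$ satisfies the restriction on $x$, while $R = 7 = T + (T - \eta_0)$ does not, in line with the remark that $x$ must be smaller than $T - \eta_0$.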

7.4.2. Proof of Lemma 7.3. 

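The proof of Lemma 7.3 requires an additional result stated as Lemma 7.4 below. For any $j \in \{1, \ldots, d\}$, recall that $l_j$ and $\tilde l_j$ are given by (7.42) and (7.41), respectively. For any $j \in \{1, \ldots, d\}$ and $k \in \mathbb{Z}$, set $\tilde l_{j,k} = \frac{z_k^4}{4} - \frac{z_k^2}{2} + \log(\cosh(z_k x_{j,k}/\varepsilon))$ and $l_{j,k} = -\frac{z_k^2}{2} + \log(\cosh(z_k x_{j,k}/\varepsilon))$. Denote by $\Lambda_j$, $\tilde\Lambda_j$, and $\tilde\Lambda_{j,k}$ the moment-generating functions of $l_j$, $\tilde l_j$, and $\tilde l_{j,k}$ under $P_0$, respectively. From equations (7.35) and (7.41), it turns out that for any $h$,

$$\tilde\Lambda_j(h) = \prod_{k \in \mathbb{Z}} \tilde\Lambda_{j,k}(h), \tag{7.59}$$

$$\tilde\Lambda_j(h) = \Lambda_j(h)\exp\Big(h\frac{a^2(r_\varepsilon)}{2}\Big). \tag{7.60}$$

Next, define the function $g : (z, y) \mapsto \frac{z^4}{4} - \frac{z^2}{2} + \log(\cosh(zy))$, and observe that the following relations hold:

$$\begin{aligned}
\tilde l_{j,k} &= g(z_k, x_{j,k}/\varepsilon), \\
\tilde\Lambda_{j,k}(h) &= E_0(\exp(h g(z_k, x_{j,k}/\varepsilon))).
\end{aligned} \tag{7.61}$$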

Lemma 7.4. Let $X$ be a real standard Gaussian random variable. For any $z = o(1)$ and any $h = O(1)$,

$$\log(E(\exp(h g(z, X)))) = h^2\frac{z^4}{4} + o(z^4).$$
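
The expansion in Lemma 7.4 can be checked numerically for small $z$. The Python sketch below is an illustration only: it assumes $g(z, y) = z^4/4 - z^2/2 + \log(\cosh(zy))$, and the function names, the integration interval, and the number of nodes are hypothetical choices. The expectation is approximated by a composite Simpson rule.

```python
import math


def g(z, y):
    # Assumed form of g: z^4/4 - z^2/2 + log(cosh(z*y)).
    return z ** 4 / 4 - z ** 2 / 2 + math.log(math.cosh(z * y))


def log_mgf(h, z, half_width=12.0, n=20001):
    # log E exp(h * g(z, X)) for X ~ N(0, 1), by composite Simpson's rule
    # on [-half_width, half_width]; n must be odd.
    step = 2 * half_width / (n - 1)
    total = 0.0
    for i in range(n):
        x = -half_width + i * step
        weight = 1 if i in (0, n - 1) else (4 if i % 2 == 1 else 2)
        total += weight * math.exp(h * g(z, x) - x * x / 2)
    return math.log(total * step / 3 / math.sqrt(2 * math.pi))


def leading_term(h, z):
    # Leading term h^2 z^4 / 4 claimed in Lemma 7.4.
    return h ** 2 * z ** 4 / 4


if __name__ == "__main__":
    for h, z in ((1.0, 0.2), (2.0, 0.1), (1.5, 0.05)):
        print(h, z, log_mgf(h, z), leading_term(h, z))
```

For $h = 1$ the identity $E(\cosh(zX)) = \exp(z^2/2)$ makes the leading term exact, which gives a convenient sanity check of the quadrature.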

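Proof of Lemma 7.4.

For some $\delta > 0$, consider the event $\mathcal{E} = \{|zX| \le \delta\}$ and denote by $\bar{\mathcal{E}}$ its complement in $\mathbb{R}$. We shall study the expectations $G_1(h, \delta) = E(\exp(h\log(\cosh(zX)))1_{\mathcal{E}})$ and $G_2(h, \delta) = E(\exp(h\log(\cosh(zX)))1_{\bar{\mathcal{E}}})$ separately. At this point, we choose $\delta$ small enough ($\delta = o(1)$) to satisfy $z\delta^{-1} = o(1)$.

First, let us study the term $G_2(h, \delta)$. With the use of the inequality $\cosh(x) \le \exp(|x|)$, $\forall x \in \mathbb{R}$, and the fact that $h = O(1)$, the routine calculations of exponential moments of a real Gaussian random variable lead to

$$\begin{aligned}
G_2(h, \delta) &\le E\big(\exp(h|zX|)1_{\{|zX| > \delta\}}\big) \\
&= 2E\big(\exp(hzX)1_{\{X > \delta/z\}}\big) \\
&= \frac{2}{\sqrt{2\pi}}\int_{\mathbb{R}_+}\exp\Big(-\frac{1}{2}(x - hz)^2\Big)1_{\{x > \delta/z\}}\,dx\,\exp\Big(\frac{1}{2}h^2z^2\Big) \\
&\le 2\exp\Big(h^2\frac{z^2}{2}\Big)\exp\Big(-\frac{1}{2}\Big(\frac{\delta}{z} - hz\Big)^2\Big) \\
&\le 2\exp\Big(-\frac{\delta^2}{2z^2} + o(1)\Big),
\end{aligned} \tag{7.62}$$

where, with our choice of $\delta$, the right-hand side of (7.62) is small.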

Now, we move on to the term $G_1(h, \delta)$. If $\delta$ is small enough, then $|zX|$ is also small and the following relation holds:

$$\log(\cosh(zX)) = \frac{z^2}{2}X^2 - \frac{z^4}{12}X^4 + o(z^4X^4). \tag{7.63}$$

Then the routine calculations of exponential moments as above lead to the following:

$$\begin{aligned}
G_1(h, \delta) &= E\Big(\exp\Big(h\frac{z^2}{2}X^2 - h\frac{z^4}{12}X^4(1 + o(1))\Big)1_{\mathcal{E}}\Big) \\
&= E\Big(\exp\Big(h\frac{z^2}{2}X^2\Big)\Big(1 - h\frac{z^4}{12}X^4(1 + o(1))\Big)1_{\mathcal{E}}\Big) \\
&= \exp\Big(-\frac{1}{2}\log(1 - hz^2)\Big)\exp\Big(-\frac{h}{4}z^4(1 + o(1))\Big) \\
&= \exp\Big(\frac{h}{2}z^2 + \frac{h^2}{4}z^4(1 + o(1))\Big)\exp\Big(-\frac{h}{4}z^4(1 + o(1))\Big) \\
&= \exp\Big(\frac{h}{2}z^2 + \frac{h^2}{4}z^4 - \frac{h}{4}z^4 + o(z^4)\Big).
\end{aligned} \tag{7.64}$$

Taking $h = O(1)$, $z = o(1)$, $\delta = o(1)$ and $z\delta^{-1} = o(1)$ in relations (7.62) and (7.64) entails that $G_1(h, \delta) = O(1)$, $G_2(h, \delta) = O(\exp(-\delta^2/(2z^2))) = o(1)$, and therefore $G_2(h, \delta)(G_1(h, \delta))^{-1} = o(1)$.

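Next, due to (7.62), (7.64) and using the fact that $h = O(1)$, $z = o(1)$, for small $\delta$ such that $z\delta^{-1} = o(1)$ and $\delta = o(1)$, we obtain

$$\begin{aligned}
\log(E(\exp(h g(z, X)))) &= \log(G_1(h, \delta) + G_2(h, \delta)) - \frac{h}{2}\Big(z^2 - \frac{z^4}{2}\Big) \\
&= \Big(\log G_1(h, \delta) - \frac{h}{2}\Big(z^2 - \frac{z^4}{2}\Big)\Big) + \log\Big(1 + \frac{G_2(h, \delta)}{G_1(h, \delta)}\Big) \\
&= \Big(h^2\frac{z^4}{4} + o(z^4)\Big)\Big(1 + O\Big(\frac{G_2(h, \delta)}{G_1(h, \delta)\big(h^2\frac{z^4}{4} + o(z^4)\big)}\Big)\Big) \\
&= h^2\frac{z^4}{4} + o(z^4),
\end{aligned} \tag{7.65}$$

where relation (7.65) holds provided that

$$\frac{G_2(h, \delta)}{G_1(h, \delta)\big(h^2\frac{z^4}{4} + o(z^4)\big)} = o(1). \tag{7.66}$$

It is then sufficient to prove (7.66) since (7.65) is the expected result of Lemma 7.4. Recall that $h = O(1)$ and $z = o(1)$ entail that $G_1(h, \delta) = O(1)$ and $G_2(h, \delta) = O(\exp(-\delta^2/(2z^2)))$. Then, it is sufficient to establish that $\exp\big(-\frac{1}{2}\frac{\delta^2}{z^2}\big)z^{-4} = o(1)$. The latter holds if we choose $\delta$ such that $\delta^{-1} = o\big((z\sqrt{\log(z^{-1})})^{-1}\big)$.
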
Proof of Lemma 7.3. 

Remark 4.2 and relations (4.10) imply that $\sup_k z_k^2 \le z_0^2 = o(1)$ as soon as $\log(d) = o(\varepsilon^{-2/(2\tau+1)})$. Due to (7.61), for any $h$ such that $h = O(1)$, Lemma 7.4 can be applied to the moment-generating function $\tilde\Lambda_{j,k}(h)$.

Here and later, we consider any $j \in \{1, \ldots, d\}$ and any $k \in \mathbb{Z}$. Due to relations (7.59), (7.61), (7.21), (7.41), by applying Lemma 7.4 and using the exponential Chebyshev's inequality, we obtain for any positive $h$ such that $h = O(1)$,

$$\begin{aligned}
P_0(\tilde l_j > T) &\le \tilde\Lambda_j(h)\exp(-hT) \\
&\le \exp\Big(\frac{h^2}{2}a^2(r_\varepsilon) - hT + o(a^2(r_\varepsilon))\Big).
\end{aligned} \tag{7.67}$$

The minimum on the right-hand side of (7.67) is attained for $h = \frac{T}{a^2(r_\varepsilon)}$, which is positive and of order 1; this allows us to prove relation (7.47).
Due to relations (7.61), (7.59), (7.21), (7.41), (7.43), by applying again Lemma 7.4 and using the exponential Chebyshev's inequality, we obtain for any positive $h$ such that $h = O(1)$,

$$\begin{aligned}
P_{\nu_j}(\tilde l_j > T) &\le E_{\nu_j}(\exp(h\tilde l_j))\exp(-hT) \\
&= \tilde\Lambda_j(h + 1)\exp\Big(-\frac{a^2(r_\varepsilon)}{2} - hT\Big) \\
&= \exp\Big(\frac{(h + 1)^2}{2}a^2(r_\varepsilon) - \frac{a^2(r_\varepsilon)}{2} - hT + o(a^2(r_\varepsilon))\Big),
\end{aligned} \tag{7.68}$$

where the minimum in the right-hand side of (7.68) is attained for $h = \frac{T}{a^2(r_\varepsilon)} - 1$, which is positive and of order 1; this yields relation (7.48).

Recall that under the assumption of Lemma 7.3, the quantity $2a^2(r_\varepsilon) - T$ is positive. Therefore, from (7.61), (7.59), (7.21), (7.44), (7.41), and (7.60), applying Lemma 7.4 and using the exponential Chebyshev's inequality, we get for any positive $h$ such that $h = O(1)$,

$$\begin{aligned}
P_{h_j}(\tilde l_j \le T) &= P_{h_j}(-\tilde l_j \ge -T) \\
&\le E_0\big(\exp(-h\tilde l_j)\exp(2\tilde l_j)\big)\exp(-a^2(r_\varepsilon))(\Lambda_j(2))^{-1}\exp(hT) \\
&= \tilde\Lambda_j(2 - h)(\Lambda_j(2))^{-1}\exp(-a^2(r_\varepsilon) + Th) \\
&= \tilde\Lambda_j(2 - h)(\tilde\Lambda_j(2))^{-1}\exp(a^2(r_\varepsilon))\exp(-a^2(r_\varepsilon) + Th) \\
&= \exp\Big(\frac{1}{2}(2 - h)^2a^2(r_\varepsilon) - 2a^2(r_\varepsilon) + Th + o(a^2(r_\varepsilon))\Big),
\end{aligned} \tag{7.69}$$

where the minimum in the right-hand side of (7.69) is achieved for $h = -\frac{T}{a^2(r_\varepsilon)} + 2$, which is positive and of order $O(1)$; this yields relation (7.49). The proof of Lemma 7.3 is completed.
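
The three optimizations over $h$ in (7.67), (7.68), and (7.69) can be verified numerically. The Python sketch below is an illustration only: it drops the $o(a^2(r_\varepsilon))$ remainders, writes `a2` for $a^2(r_\varepsilon)$, and uses hypothetical function names and a hypothetical grid; it compares a grid minimum with the closed-form exponents of (7.47), (7.48), and (7.49).

```python
def bound_767(h, T, a2):
    # Exponent in (7.67), remainder term dropped.
    return 0.5 * h * h * a2 - h * T


def bound_768(h, T, a2):
    # Exponent in (7.68), remainder term dropped.
    return 0.5 * (h + 1) ** 2 * a2 - 0.5 * a2 - h * T


def bound_769(h, T, a2):
    # Exponent in (7.69), remainder term dropped.
    return 0.5 * (2 - h) ** 2 * a2 - 2 * a2 + T * h


def closed_747(T, a2):
    return -(T ** 2) / (2 * a2)


def closed_748(T, a2):
    return -((T - a2) ** 2) / (2 * a2)


def closed_749(T, a2):
    return -((T - 2 * a2) ** 2) / (2 * a2)


def grid_min(bound, T, a2, h_max=3.0, steps=3000):
    # Minimum of the exponent over positive h on a regular grid.
    return min(bound(i * h_max / steps, T, a2) for i in range(1, steps + 1))


if __name__ == "__main__":
    T, a2 = 6.0, 4.0  # T / a2 = 1.5 lies strictly between 1 and 2
    print(grid_min(bound_767, T, a2), closed_747(T, a2))
    print(grid_min(bound_768, T, a2), closed_748(T, a2))
    print(grid_min(bound_769, T, a2), closed_749(T, a2))
```

The example takes $T/a^2(r_\varepsilon) = 1.5$, so that the conditions $\liminf(T/a^2(r_\varepsilon)) > 1$ and $\limsup(T/a^2(r_\varepsilon)) < 2$ of Lemma 7.3 are both mimicked and all three minimizers are positive.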